
An empirical dynamic model of mortgage default in Colombia between 1997 and 2004

Juan Esteban Carranza, Salvador Navarro

Abstract

We estimate a dynamic model of default for a cohort of Colombian debtors between 1997 and 2004, which was a period of unprecedented financial stress in Colombia. We develop a methodological framework based on a fully dynamic behavioral model that accounts for both cross-sectional and aggregate heterogeneity. Specifically, we control for the unobserved heterogeneity using non-matching survey data and a dynamic aggregate shock that can be inferred directly from the observed aggregate behavior. Results indicate that the dynamic structure and the unobserved heterogeneity are crucial for identifying correctly the impact of different factors on default behavior. We also point out that the methodology's applicability goes beyond the estimation of default models.

1 Introduction

The standard approach for estimating mortgage default risk is the use of duration models as described in Deng, Quigley, and Van Order (2000). The advantage of this approach over standard multinomial choice models is that it allows the estimation of default risk even when default rates are very low. Nevertheless, the estimates obtained from such models have no clear connection to a behavioral model, and therefore their use for policy analysis is limited, in the sense that the estimates are not "structural" and therefore may not be stable across counterfactual equilibria.[1] Moreover, neither duration models nor standard multinomial choice models fully account for the dynamic incentives of debtors, and they may therefore yield misleading estimates of the underlying determinants of debtor behavior. For example, if home prices are expected to keep on rising over time, such models may suggest that the debtors' default behavior is unresponsive to home prices, even if that is not true.

[1] See Cunha, Heckman, and Navarro (2007).

In this paper, we construct a dynamic model of mortgage default and estimate it using micro-level Colombian data spanning the years between 1998 and 2004. During this time, mortgage default rates in Colombia were unusually high due to an unprecedented economic downturn that was accompanied by a dramatic fall in home prices. The extent to which the fall in household incomes and the fall in home prices contributed separately to the unprecedented rates of default is a relevant policy question that can be answered with the proposed estimation. [We'll mention something more specific about the results here.]

As usual, the difficulties of estimating the dynamic model of default stem from the need to allow for a rich pattern of unobserved heterogeneity. We develop a technique for estimating the model that allows for the presence of an unobserved state variable that is correlated across debtors and over time. Specifically, we use simulation methods to control for the unobserved heterogeneity of debtors that is constant over time, the distribution of which, as shown in Heckman and Navarro (2007), is identified nonparametrically. In addition, using ideas from the empirical IO literature, we develop a methodology that allows us to incorporate an aggregate shock that is correlated over time. Both the value of the aggregate shock and its transition can be recovered nonparametrically.

The paper proceeds as follows: in the next section we describe our theory of mortgage default using a dynamic decision model and propose an estimation strategy. Then, we describe the data and the details of the estimation, including results.

2 An empirical model of mortgage default

2.1 The individual model of default

We study the behavior of mortgage holders ("debtors") who live in the mortgaged piece of real estate ("home"). The utility that a debtor $i$ gets from the home depends on a measure $q_{i,t}$ of subjective home quality. It also depends on the difference between household income and mortgage payments $y_{i,t} - m_{i,t}$ and on an idiosyncratic preference shock $\varepsilon_{i,t}$ which incorporates unobserved (to the econometrician) variables that may affect default, e.g. home attributes that are only valued by its owner and other preference shocks that vary across consumers and time. The estimation of the model is ultimately based on the properties of these unobserved variables.

Let $W_{i,t}$ denote the value for individual $i$ of defaulting on her mortgage at time $t$. This value is the result of a complex scenario. Specifically, the individual may be waiting to see whether the following period she can pay back her dues; she may try to sell the home and cash the difference between price and loan balance; she may let the bank take over the property to cover her obligation; finally, she could also just stop making payments indefinitely and face forfeiture or a renegotiation with the bank. The resulting value of default $W_{i,t}$ is the weighted sum of payoffs across the random scenarios just described. For simplicity, we write $W_{i,t}$ as a reduced form

$$W_{i,t} = W\left(\bar{\pi}_{i,t}, b_{i,t}, y_{i,t}, \varepsilon^{w}_{i,t}\right), \tag{1}$$

where $\bar{\pi}_{i,t}$ is the expected price of the home at time $t$, $b_{i,t}$ is the balance of the debt, $y_{i,t}$ is the debtor's income and $\varepsilon^{w}_{i,t}$ are other unobserved (to the econometrician) attributes. These are variables that enter directly the payoffs of the individual scenarios arising after a default decision as discussed above.

A debtor will choose to default on her mortgage if the utility of making the mortgage payments and consuming the housing services from her home is lower than the utility obtained by not making the loan payment at the time. That is, if we let $V_{l_i,t}$ be the value for a debtor with a mortgage of term length $l_i$ of not defaulting on her loan at time $t$, we can recursively write the value of the decision problem as follows:

$$\begin{aligned}
V_{l_i,t} &= \max\left\{ u(q_{i,t}, y_{i,t} - m_{i,t}, \varepsilon^{u}_{i,t}) + \beta E_t V_{l_i,t+1},\; W\left(\bar{\pi}_{i,t}, b_{i,t}, y_{i,t}, \varepsilon^{w}_{i,t}\right) \right\} \quad \text{for } t < T_i^{*}, \\
V_{l_i,T_i^{*}} &= \max\left\{ u(q_{i,T_i^{*}}, y_{i,T_i^{*}} - m_{i,T_i^{*}}, \varepsilon^{u}_{i,T_i^{*}}),\; W\left(\bar{\pi}_{i,T_i^{*}}, b_{i,T_i^{*}}, y_{i,T_i^{*}}, \varepsilon^{w}_{i,T_i^{*}}\right) \right\},
\end{aligned} \tag{2}$$

where $u(\cdot)$ is the instant payoff from consuming the home in the current period and $T_i^{*}$ is the last period of the mortgage.

In order to fix ideas, we assume that the payoff of default $W_{i,t}$ is a linear function of relevant states:

$$W_{i,t} = \omega_0 + \omega_1 y_{i,t} + \omega_2 \bar{\pi}_{i,t} + \omega_3 b_{i,t} + \varepsilon^{w}_{i,t}. \tag{3}$$

This payoff is a linear combination of payoffs across random outcomes. If these payoffs are linear functions of states, then the linear payoff function $W_{i,t}$ should be stable across counterfactual equilibria, as long as states don't affect the probabilities of individual outcomes. Notice that a careful interpretation of the function $W_{i,t}$ is important because the usefulness of the model for counterfactual analysis relies on the assumption that this function will not change when we change the values or the transition probabilities of the state variables.

We also assume that the static utility is additively separable in observable and unobservable states:

$$u(q_{i,t}, y_{i,t} - m_{i,t}, \varepsilon^{u}_{i,t}) = \theta_0 + \gamma q_{i,t} + \alpha (y_{i,t} - m_{i,t}) + \varepsilon^{u}_{i,t}. \tag{4}$$

Then, it follows from (2), (3) and (4) that a debtor $i$ will choose to continue paying her dues if:

$$\theta_0 + \gamma q_{i,t} + \alpha (y_{i,t} - m_{i,t}) + \varepsilon^{u}_{i,t} + \beta E_t V_{l_i,t+1} \;\geq\; \omega_0 + \omega_1 y_{i,t} + \omega_2 \bar{\pi}_{i,t} + \omega_3 b_{i,t} + \varepsilon^{w}_{i,t}, \tag{5}$$

where the continuation payoff will depend on all observed and unobserved state variables, as described below. Since no home attributes are observed in our application, we will further assume that the unobserved "quality" of homes $q_{i,t}$ is random:

$$q_{i,t} \equiv \kappa + \varepsilon^{q}_{i,t}, \tag{6}$$

where $\varepsilon^{q}_{i,t}$ is a random variable that is potentially correlated over time and across debtors.

Any systematic difference in the subjective home quality across debtors will be captured by the correlation structure of the error, which will be described in the estimation section below.

In our data set we have no information on the required payments $m_{i,t}$ of each debtor. However, it is known that the required payments are linear functions of mortgage balances $b_{i,t}$ and remaining term $L_{i,t}$, with some random variation across debtors:

$$m_{i,t} = \rho_0 + \rho_1 b_{i,t} + \rho_2 L_{i,t} + \varepsilon^{m}_{i,t}, \tag{7}$$

where $\varepsilon^{m}_{i,t}$ is an error term. Group the unobserved components into one error term $\bar{\varepsilon}_{i,t} \equiv \gamma \varepsilon^{q}_{i,t} - \alpha \varepsilon^{m}_{i,t} + \varepsilon^{u}_{i,t} - \varepsilon^{w}_{i,t}$ and let $\Omega_{i,t} = \{\bar{\pi}_{i,t}, y_{i,t}, b_{i,t}, L_{i,t}, \bar{\varepsilon}_{i,t}\}$ be the vector of observed and unobserved states. Assume that $\Omega_{i,t}$ follows a first order Markov process so that $E[\Omega_{i,t+1} \mid \Omega_{i,t}] = \Psi(\Omega_{i,t})$.

By substituting (3), (4), (6) and (7) in (2), we obtain the value of the debtor's problem at each point in time as a function of variables that can be mapped to the data and of unobserved random variables. For mortgages that have reached their last period $T_i^{*}$, the value function is given by:

$$V_{l_i,T_i^{*}}(\Omega_{i,T_i^{*}}) = \max\left\{0,\; \zeta_0 + \zeta_1 \bar{\pi}_{i,T_i^{*}} + \zeta_2 y_{i,T_i^{*}} + \zeta_3 b_{i,T_i^{*}} + \zeta_4 L_{i,T_i^{*}} + \bar{\varepsilon}_{i,T_i^{*}}\right\} \tag{8}$$

whereas for $t < T_i^{*}$, this value is:

$$V_{l_i,t}(\Omega_{i,t}) = \max\left\{0,\; \zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 y_{i,t} + \zeta_3 b_{i,t} + \zeta_4 L_{i,t} + \bar{\varepsilon}_{i,t} + \beta E\left[V_{l_i,t+1}(\Omega_{i,t+1}) \mid \Omega_{i,t}\right]\right\} \tag{9}$$

where the parameters to be estimated $\zeta = \{\zeta_0, \zeta_1, \zeta_2, \zeta_3, \zeta_4\}$ are linear combinations of the underlying structural parameters.

Notice that this function can be computed recursively starting from the last period if all the state variables and their transition are known. In order to allow for a rich pattern of choice behavior, the unobserved state $\bar{\varepsilon}_{i,t}$ will be decomposed as follows:

$$\bar{\varepsilon}_{i,t} = \xi_t + \mu_i + \epsilon_{i,t} \tag{10}$$

where the term $\xi_t$ is a common aggregate unobserved shock, $\mu_i$ is an individual-specific unobservable state and $\epsilon_{i,t}$ is an iid idiosyncratic disturbance. This specification allows individual choices to be correlated over time and across debtors; in addition, this unobserved heterogeneity can be allowed to depend on other observed states such as income or debt balances, which would be equivalent to a model with heterogeneous $\zeta$ coefficients.

Let $D_{i,t} = 1$ be the event that debtor $i$ does not default at time $t$. Assume that the unobserved components of utility are normalized so that $\epsilon_{i,t}$ is a logit shock that is conditionally independent of the observed states. Under these assumptions, the model above generates the following non-default probability for debtor $i$ at time $t$ conditional on not having defaulted on the mortgage up to $t - 1$:

$$P_{i,t}(\Omega_{i,t}; \zeta, l_i) = \Pr(D_{i,t} = 1 \mid \Omega_{i,t}) = \frac{e^{\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 y_{i,t} + \zeta_3 b_{i,t} + \zeta_4 L_{i,t} + \xi_t + \mu_i + \beta E[V_{l_i,t+1}(\Omega_{i,t+1}) \mid \Omega_{i,t}]}}{1 + e^{\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 y_{i,t} + \zeta_3 b_{i,t} + \zeta_4 L_{i,t} + \xi_t + \mu_i + \beta E[V_{l_i,t+1}(\Omega_{i,t+1}) \mid \Omega_{i,t}]}}. \tag{11}$$
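As an illustration only, the backward recursion behind (8), (9) and (11) can be sketched in a deliberately simplified setting in which the path of the deterministic index $\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \dots + \xi_t + \mu_i$ is known in advance, so that the conditional expectation over future states drops out. This perfect-foresight assumption and the function names are ours, not part of the model. The sketch uses the standard result that, for a logistic shock $\epsilon$, $E\max\{0, v + \epsilon\} = \log(1 + e^{v})$.

```python
import math

def log1pexp(v):
    """Numerically stable log(1 + exp(v))."""
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))

def logistic(v):
    """Numerically stable 1 / (1 + exp(-v))."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def non_default_probs(index, beta):
    """Per-period non-default probabilities for one debtor.

    index[t] is the deterministic part of the payoff of continuing in
    period t (assumed known in advance); the last entry is the final
    period of the mortgage. beta is the discount factor.
    """
    n_periods = len(index)
    probs = [0.0] * n_periods
    expected_value_next = 0.0  # nothing is left after the final period
    for t in range(n_periods - 1, -1, -1):
        v = index[t] + beta * expected_value_next
        probs[t] = logistic(v)               # logit probability of not defaulting
        expected_value_next = log1pexp(v)    # E max{0, v + eps}, eps logistic
    return probs

# Two periods with a zero index: the option value raises the first-period probability.
print(non_default_probs([0.0, 0.0], beta=0.97))
```

With `beta = 0` the function reduces to a static logit, which is the sense in which a static specification ignores the continuation payoff.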

Notice that the choice probabilities (11) are identical to the choice probabilities in standard Markovian decision processes as described by, for example, Rust (1987). The presence of the unobserved correlated states $\xi_t$ and $\mu_i$ complicates the estimation of the model.[2] In the following subsection, we describe the strategy we follow to estimate the model.

[2] In addition, in our application below no matching information is available on the income of individual debtors, so that instead non-matching survey data has to be used.

2.2 Estimation

Consider estimating the model above using debtor-level data on mortgage balances, mortgage terms and home prices over a set of $t = 1, \ldots, T$ time periods. The objective is to obtain estimates of $\zeta$ by matching the predictions of the model to the observed data. As indicated above, there are two difficulties associated with estimating the model using the predicted choice probabilities in (11).

The first difficulty arises because the Colombian mortgage data we use does not contain matching income data tracing the evolution of income for individual debtors. We address this problem using household survey data containing information on debtors' income and mortgage holdings. Specifically, we treat income as an unobserved state with distribution given by the empirical distribution contained in the survey.

The second difficulty is not particular to our application since it arises from the need to compute the value functions (8) and (9) along the estimation algorithm, due to the presence of the unobserved states $\mu$ and $\xi$.

This presents a bigger challenge since the computation of (8) and (9) depends crucially on both the levels and the transitions of $\mu$ and $\xi$. On one hand, since the distribution of the individual-level persistent heterogeneity $\mu_i$ is nonparametrically identified,[3] it can be simulated from any assumed parametric distribution. Moreover, this distribution can be assumed to be correlated with any observable characteristic. On the other hand, the aggregate randomness $\xi_t$ and its transition can be inferred directly conditioning on all remaining parameters and equating the observed and predicted aggregate default behavior. This idea is similar to the estimation algorithm described by Berry (1994) and Berry, Levinsohn, and Pakes (1995) to infer the value of the unobserved attributes of goods from aggregate market shares.

[3] See Heckman and Navarro (2007) for a proof.

Specifically, let $G^{Y}_{t}(Y \mid K)$ be the distribution of income conditional on the value of collateral, which is contained in the household surveys. Let $G^{\mu}(\sigma)$ be the parametric distribution of $\mu$ (e.g. normal) which depends on the parameter $\sigma$ that has to be estimated. Let $X_{i,t} = (\bar{\pi}_{i,t}, b_{i,t}, L_{i,t})$ denote the observed states. Given a transition for $X_{i,t}$ (which can be inferred directly from the data), then for any realization of the aggregate shocks (and their implied transition) and any choice of parameters $\{\zeta', \sigma'\}$ we can obtain the expected unconditional non-default probability of debtor $i$ at time $t$ by integrating the conditional probability (11) over the distribution of $Y$ and $\mu$:

$$\bar{P}_{i,t}(\xi_t, X_{i,t}; \zeta', \sigma', l_i) = \frac{\int \prod_{\tau=1}^{t-1} P_{i,\tau}(X_{i,\tau}, \xi_\tau, Y, \mu; \zeta', l_i)\, P_{i,t}(X_{i,t}, \xi_t, Y, \mu; \zeta', l_i)\, dG^{Y}_{t}(Y \mid K)\, dG^{\mu}(\mu; \sigma')}{\int \prod_{\tau=1}^{t-1} P_{i,\tau}(X_{i,\tau}, \xi_\tau, Y, \mu; \zeta', l_i)\, dG^{Y}_{t}(Y \mid K)\, dG^{\mu}(\mu; \sigma')} \tag{12}$$

where the integral is taken with respect to the empirical conditional distribution of income and the assumed parametric distribution of the unobserved individual heterogeneity $\mu$. Moreover, for any vector of parameters $\{\zeta', \sigma'\}$ the model generates an aggregate non-default probability for each time period that depends only on the aggregate shock $\xi$:

$$\bar{P}_t(\xi_t; \zeta', \sigma') = \frac{1}{\operatorname{size}(N_t)} \sum_{i \in N_t} \bar{P}_{i,t}(X_{i,t}, \xi_t; \zeta', \sigma', l_i) \tag{13}$$

where the average is taken over the set $N_t$ of debtors with outstanding debts at time $t$.

Equating the predicted and the observed aggregate non-default probabilities generates a system of $T$ non-linear equations. From this set of equations we can infer the values of the unobserved aggregate shocks. Solving for the vector of unobserved shocks $\xi = \{\xi_1, \ldots, \xi_T\}$ has to be done using numerical techniques and is not easy, due to the fact that its transition is not known.

To compute the value of the vector $\xi = \xi(\zeta', \sigma')$ that is consistent with the data and any vector $\{\zeta', \sigma'\}$ of model parameters, fix the vector of parameters $\{\zeta, \sigma\} = \{\zeta', \sigma'\}$ at any arbitrary value. Set also the vector of unobserved aggregate shocks equal to some initial value (e.g. $\xi^{o} = 0$) and estimate its transition. Given $\{\zeta', \sigma'\}$ and $\xi^{o}$, compute the expected continuation payoffs from (9) for a set of simulated draws of $\{\mu_1, \ldots, \mu_S\}$ and $\{Y_1, \ldots, Y_S\}$ over $T$ periods:

$$\bar{V}^{o}_{l_i,t}(\Omega^{o}_{i,t}) = \log\left(1 + e^{\zeta'_0 + \zeta'_1 \bar{\pi}_{i,t} + \zeta'_2 y_{i,t} + \zeta'_3 b_{i,t} + \zeta'_4 L_{i,t} + \xi^{o}_t + \mu_i + \beta E\left[\bar{V}^{o}_{l_i,t+1}(\Omega^{o}_{i,t+1}) \mid \Omega^{o}_{i,t}\right]}\right) \tag{14}$$

where the expectation is taken over the logit shock $\epsilon_{i,t}$ and $\Omega^{o}_{i,t} = \{\bar{\pi}_{i,t}, b_{i,t}, L_{i,t}, \xi^{o}_t, Y, \mu\}$ is the vector of states containing the initialized values $\xi^{o}$. This function can be computed using a fixed point algorithm given some bounded transition for the observed states.

Given the values $\bar{V}^{o}_{l_i,t}(\Omega^{o}_{i,t})$ and $\{\zeta', \sigma'\}$, we can solve for the vector $\xi'$ of implied aggregate shocks that solves the equality of predicted and observed aggregate choice probabilities ($s_t$):

$$s_t = \bar{P}_t(\xi'; \zeta', \sigma') \tag{15}$$

This equality generates a new vector $\xi'$ and a new transition associated with it. For the model to be consistent we need $\xi' = \xi^{o}$. In the application below we construct a fixed point algorithm and iterate on $\xi'$ (and its transition) in (15) until $\xi'$ converges to $\xi^{*} = \xi^{*}(\zeta', \sigma')$, so that for any choice of parameters, a corresponding set of aggregate shocks can be found. Notice that other solving methods may be more appropriate depending on the properties of the problem.
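As one example of an alternative solving method, the sketch below recovers a single period's shock by bisection. It is a simplification under assumptions of our own: the model is treated as static, so the periods decouple, and each simulated debtor-draw is summarized by a fixed index to which $\xi_t$ is added. In that case the predicted non-default share is strictly increasing in $\xi_t$, so the equation $s_t = \bar{P}_t(\xi_t)$ has at most one root. The function names and the search bracket are hypothetical.

```python
import math

def logistic(v):
    """Numerically stable 1 / (1 + exp(-v))."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def predicted_share(xi, indices):
    """Average non-default probability over simulated debtor-draws.

    indices holds the part of each draw's payoff that does not depend
    on the aggregate shock (observed states, income draw, mu draw).
    """
    return sum(logistic(a + xi) for a in indices) / len(indices)

def solve_xi(observed_share, indices, lo=-30.0, hi=30.0, tol=1e-10):
    """Find xi such that predicted_share(xi) equals observed_share.

    Bisection is valid because the predicted share is increasing in xi;
    the bracket [lo, hi] is an assumption and must contain the root.
    """
    if not 0.0 < observed_share < 1.0:
        raise ValueError("observed share must lie strictly between 0 and 1")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if predicted_share(mid, indices) < observed_share:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Recover a known shock from the share it generates.
indices = [-1.0, 0.5, 2.0]
share = predicted_share(0.3, indices)
print(solve_xi(share, indices))
```

In the dynamic model the continuation values also depend on $\xi$ and its transition, so a step of this kind would sit inside the outer iteration described above rather than replace it.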


With the values of the aggregate shocks $\xi^{*}(\zeta', \sigma')$ at hand we can compute the likelihood of the sample for any choice of parameters $\{\zeta', \sigma'\}$, which is the product across debtors of individual default/non-default histories, integrated over the distribution of the unobservables:

$$\ell(\zeta', \sigma') = \prod_{i \in N_t} \int \left[ \prod_{t} P_{i,t}\left(\Omega^{*}_{i,t}; \zeta', l_i\right)^{D_{i,t}} \left(1 - P_{i,t}\left(\Omega^{*}_{i,t}; \zeta', l_i\right)\right)^{(1 - D_{i,t})} \right] dG^{Y}_{t}(Y \mid K)\, dG^{\mu}(\mu; \sigma') \tag{16}$$

where $\Omega^{*}_{i,t}$ is the vector of states containing the aggregate shocks $\xi^{*}$ obtained using the algorithm described above. Estimates of $\zeta$ and $\sigma$ can therefore be obtained by finding the values that maximize (16).

The idea that aggregate fixed effects can be concentrated out from the estimation algorithm, taking advantage of the identity of the predicted and observed aggregate behavior, has seldom been used in contexts different than the usual BLP-style estimation.[4] To our knowledge, it has never been used in the context of a dynamic programming problem.

[4] For an exception, see Bayer, Ferreira, and McMillan (2007).

3 Data and estimation results

3.1 Description of the data

The estimation of the model above is based on two separate data sets. The first (or "main") data set contains information on 16000 random mortgages that were outstanding between 1997 and 2002. The monthly payment history of each mortgage, its original and current value and term of the mortgaged home are included. A "secondary" data set contains non-matching individual-level demographic data, including income and real estate holdings.

The total number of loans contained in the main data set is 16000. Nevertheless, this set of mortgages includes loans that started at different points in time, most of them before 1997. From this subset of loans that started before 1997 we only observe those that survived until 1997. Since our model predicts that loan survival is endogenous, for the estimation below we selected the cohort of loans started during the year 1997 and assumed that the distribution of unobserved attributes of new debtors is the same throughout that year. After eliminating from our sample those loans with incomplete or inconsistent payment histories, we ended with a total of 925 loans which are observed from the time they start in 1997 until 2004.[5] Despite the reduction in the number of mortgages in our data set, we end up with a panel with 14250 observations.

The data set contains only the price of each home at the time the loan started as reported by the bank. These prices are very reliable because banks are very serious about the value of the collateral. The expected prices of individual homes at any point in time $\bar{P}_{it}$ are updated using housing price indices constructed by the Colombian Central Bank. In addition, all data is aggregated into quarters, so that default observations are not confounded with missed payments or coding errors.

All variables are expressed in constant 1997 real Colombian pesos.

Since this main data set contains no information on the income of debtors over the span of the sample, survey data from the secondary data set was used to control for the changing distribution of income. This data set is part of an annual survey conducted by DANE that contains large samples of individual household demographics. We selected households in the sample who reported having a home loan. We use the reported income and matching housing payments to simulate the joint distribution of income and the other state variables.

In the data it is observed that sometimes debtors temporarily stop making their payments. Therefore what 'default' means has to be defined. Specifically, in the estimation below, loans that accumulate past due payments of more than 3 months are assumed to be defaulted and are dropped from the data set. Therefore, 'default' is defined as the event in which the number of past due payments in a loan history changes from 3 or less to more than 3 between two quarters. After a loan is defined to be defaulted, it is dropped from the sample.[6]

Table 1 contains some summary statistics of the main data set, which goes from the first quarter of 1997 to the second quarter of 2004.[7] The number of loans in the data set increases during the first four quarters of 1997 as new loans are initiated until reaching 925, which is the total number of loans in the cohort. Notice from column (3) that the number of non-defaulted loans decreases gradually over time, which is a reflection of the high numbers of defaults observed in the sample. The default rate, defined as the number of defaults over the total number of outstanding loans in column (4), reaches a level higher than 7% during the fourth quarter of 1999, which is indicative of the severity of the market collapse. By the end of the sample more than half of the loans in the sample were defaulted.

[5] For a detailed study of the default behavior observed in the whole sample using a simpler empirical model see Carranza and Estrada (2007).

[6] The default rate based on this definition is highly correlated with default rates based on longer default periods. The 3-month threshold was chosen in order to observe as much default as possible and in order to capture all defaulted loans, including those that are terminated soon after default.

[7] Since default is inferred from the change in the number of past due mortgage payments, no default is reported during the first period of the sample.

To give a sense of the characteristics of the defaulted loans we computed the average price of homes with outstanding loans (column (5)) and the average price of all homes in the sample (column (6)). Notice that up until the middle of 1999, the average price of homes with outstanding mortgages was higher than the average price of the homes of all the loans in the sample, which implies that defaults tended to occur among the mortgages of the least expensive homes. After 1999 the price of homes with outstanding loans was lower than the average price of all homes in the sample, which implies that defaults were then concentrated among mortgages of the more expensive homes.

The second and third columns of Table 2 characterize the observed correlations contained in the main data set. Specifically, a simple logit model of non-default was estimated using the definition of default described above.

The use of the logit model will facilitate the comparison with the estimates of the structural model. Explanatory variables include the mortgage balance, the expected price of the collateral and the remaining term of the loan at each point in time, which are the relevant states in the model described above. As expected, non-default is negatively correlated with the balance of mortgages at any point in time and with their remaining term. The loose negative correlation of non-default and home prices is not consistent with a model of rational behavior, in the sense that debtors will have less incentive to default the higher the price of the collateral. In fact, we would expect this relationship to be positive. The problem is that home prices should be expected to be correlated with unobserved state variables. For example, home prices should be correlated with the unobserved household income.

Besides the rich modelling of the structural error in our model, we use the secondary data set to account for the unobserved variation in individual incomes. As we describe in detail below, we construct a simulated data set with ten simulated draws for each observation. We will use the simulated data to integrate out part of the unobserved heterogeneity. The draws are taken from the corresponding quintile of the distribution of income ordered according to the monthly mortgage payments, which is assumed to match the distribution of income conditional on the ratio of balance to remaining term.

The fourth and fifth columns of Table 2 report the results of the same logit regression described above using the simulated data set. As can be seen, the coefficient of balances is more negative, whereas the coefficient of remaining term doesn't change much. According to the logit regression, the income draws have a very low but sharp positive correlation with non-default, which is not surprising. More importantly, the inclusion of income in the regression causes the price coefficient to become positive and significant, as was expected. In addition to the simulated income, the regression includes fixed time effects and the initial loan to value (LTV), which is regarded as a crucial indicator of the risk attitude of debtors. As expected, non-default is negatively correlated with this ratio. The estimates of the time effects (not shown), which are measured with respect to the constant in the second quarter of 1997, are mostly significant, which suggests that there are other time-changing unobserved factors that affect default.

Even though the results of these regression estimates are somewhat consistent with the described behavioral model, they are only descriptive. Nevertheless, the significant correlations described above are the basis for the econometric identification of the structural model.

3.2 Computation of the model

As indicated above, the estimation of the model is based on a simulated data set that allows the computation of the integrals in (12). Specifically, for each mortgage $i$ at time $t$, a set of $S_i$ income draws $\{Y_{st}\}_{s=1,\ldots,S_i}$ is simulated from the corresponding quintile of the empirical distribution of income conditional on the monthly mortgage payments, contained in the "secondary" data set. In addition, $S_i$ draws of the random shock $\mu_s$ are generated from a normal distribution:

$$\mu \sim N(0, \sigma^2) \tag{17}$$
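The following sketch illustrates how such a simulated data set could be built. It is not the authors' code: the survey is represented as hypothetical `(payment, income)` pairs, and `payment` for a loan stands for whatever proxy of the monthly payment is available (the text above suggests the ratio of balance to remaining term). The survey is assumed to contain at least five records so that every quintile is non-empty.

```python
import random

def quintile_bins(survey):
    """Sort (payment, income) pairs by payment and split them into five bins."""
    ordered = sorted(survey)
    n = len(ordered)
    return [ordered[k * n // 5:(k + 1) * n // 5] for k in range(5)]

def quintile_of(payment, bins):
    """Index of the first bin whose largest payment is at least `payment`."""
    for k, records in enumerate(bins):
        if payment <= records[-1][0]:
            return k
    return 4  # above every survey payment: use the top quintile

def simulate_draws(payment, bins, sigma, n_draws, rng):
    """Draw incomes from the matching quintile and mu from N(0, sigma^2)."""
    pool = [income for _, income in bins[quintile_of(payment, bins)]]
    incomes = [rng.choice(pool) for _ in range(n_draws)]
    mus = [rng.gauss(0.0, sigma) for _ in range(n_draws)]
    return incomes, mus

# Toy survey: payments 1..10 with income equal to ten times the payment.
survey = [(float(p), 10.0 * p) for p in range(1, 11)]
bins = quintile_bins(survey)
incomes, mus = simulate_draws(3.5, bins, sigma=0.5, n_draws=10, rng=random.Random(0))
print(incomes)
```

Sampling with replacement from the empirical records in a quintile is one simple way to draw from an empirical conditional distribution; other schemes would serve equally well for illustration.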

We assume that this randomness is correlated with the initial loan-to-value ratio of each loan, which as said before is regarded as a good predictor of the risk attitude of debtors. Specifically, we assume that this underlying correlation is determined by the following loading equation:

$$LTV_i = \alpha_0 + \alpha_1 \mu_s + \nu_s \tag{18}$$

where $\nu_s \sim N(0, \alpha_2^2)$ is a normal error with mean zero and variance $\alpha_2^2$. The coefficients of this equation and the variance of the error, $\alpha = \{\alpha_0, \alpha_1, \alpha_2\}$, are estimated jointly with the other parameters of the model.

The computation of the likelihood of individual default/non-default observations requires in addition the computation of the value functions (14). Such computation requires first the estimation of the transition of the state variables that are changing over time. We estimate directly the transition of the observed states and the simulated income. Consistently with the Markovian structure of the model, we assume that they follow independent first-order autoregressive processes:

$$\begin{aligned}
b_{it+1} &= \rho^{b}_0 + \rho^{b}_1 b_{it} + \omega^{b}_{it} \\
\pi_{it+1} &= \rho^{\pi}_0 + \rho^{\pi}_1 \pi_{it} + \omega^{\pi}_{it} \\
y_{st+1} &= \rho^{y}_0 + \rho^{y}_1 y_{st} + \omega^{y}_{st}
\end{aligned} \tag{19}$$

The main difficulty in computing the value functions is posed by the unobserved aggregate shocks $\xi$, which affect the computation both through their levels and their transition. Instead of estimating them along the estimation algorithm, we solve directly for the values of $\xi$ and its transition from the equality of observed and predicted aggregate default behavior as given by (13). Specifically, we solve for the values of $\xi$ and use them to estimate its transition, which is assumed to follow an autoregressive process:

$$\xi_{t+1} = \rho^{\xi}_1 \xi_t + \rho_2 \omega^{\xi}_t \tag{20}$$
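A first-order autoregression of the form (19) or (20) can be fitted by ordinary least squares. The sketch below does this for a single series; it is illustrative only. Pooling across loans, as (19) would require, is omitted, and reading the scale parameter in (20) as the residual standard deviation presumes a unit-variance innovation, which is our assumption.

```python
import math

def fit_ar1(series, intercept=True):
    """OLS fit of x[t+1] = const + slope * x[t] + error.

    Returns (const, slope, residual_sd). With intercept=False the
    constant is fixed at zero, as in the process assumed for xi.
    """
    x = series[:-1]
    y = series[1:]
    n = len(x)
    if intercept:
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxx = sum((a - mean_x) ** 2 for a in x)
        sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        slope = sxy / sxx
        const = mean_y - slope * mean_x
    else:
        slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
        const = 0.0
    residuals = [b - const - slope * a for a, b in zip(x, y)]
    residual_sd = math.sqrt(sum(r * r for r in residuals) / n)
    return const, slope, residual_sd

# A noiseless series generated by x[t+1] = 1 + 0.5 * x[t].
print(fit_ar1([0.0, 1.0, 1.5, 1.75, 1.875]))
```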

Finding these values is a non-trivial numerical exercise because the transition itself enters the computation of the value functions, which are computed for each simulated observation at each point in time. We solve for them using a fixed point algorithm similar to the one described in Berry (1994) and BLP:

$$\xi' = \xi + \log(s_t) - \log\left(\bar{P}_t(\cdot, \rho^{\xi})\right) \tag{21}$$

where $s_t$ and $\bar{P}_t$ are the observed and predicted average non-default rates. The fixed point of this mapping is the vector of aggregate shocks associated with the given vector of parameters, $\xi(\zeta', \sigma')$. At each iteration of the fixed point algorithm the parameters $\rho^{\xi}$ of the transition of $\xi$ are re-estimated.

Given the vector $\{\zeta', \sigma'\}$ of parameters and any guess of $\xi$ and its transition, we compute the value functions recursively starting from the last period of the mortgage over a grid of states and then interpolate to find the values for specific realizations of the state vector. We perform these computations in the fixed point mapping (21) to find $\xi^{*} = \xi(\zeta', \sigma')$.
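The mapping (21) can be illustrated in a static simplification in which the value functions are not recomputed between iterations; that simplification, the function names and the stopping rule are ours. In this case the update is a contraction, because the derivative of $\log \bar{P}_t$ with respect to $\xi_t$ lies strictly between 0 and 1 for a mixture of logit probabilities. In the full algorithm described above, the value functions and the transition of $\xi$ would be updated inside each iteration.

```python
import math

def logistic(v):
    """Numerically stable 1 / (1 + exp(-v))."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def predicted_share(xi, indices):
    """Average non-default probability over simulated debtor-draws."""
    return sum(logistic(a + xi) for a in indices) / len(indices)

def invert_shares(observed, indices_by_period, tol=1e-12, max_iter=10000):
    """Iterate xi' = xi + log(s_t) - log(P_t(xi)) until convergence.

    observed[t] is the observed non-default rate in period t and
    indices_by_period[t] holds the shock-free payoff indices of the
    simulated draws for debtors still outstanding in that period.
    """
    xi = [0.0] * len(observed)
    for _ in range(max_iter):
        new_xi = [
            x + math.log(s) - math.log(predicted_share(x, idx))
            for x, s, idx in zip(xi, observed, indices_by_period)
        ]
        if max(abs(a - b) for a, b in zip(new_xi, xi)) < tol:
            return new_xi
        xi = new_xi
    raise RuntimeError("fixed point iteration did not converge")

# Recover known shocks from the shares they generate.
indices_by_period = [[1.0, 2.0, 3.0], [0.5, 1.5, 2.5]]
true_xi = [0.4, -0.7]
observed = [predicted_share(x, idx) for x, idx in zip(true_xi, indices_by_period)]
print(invert_shares(observed, indices_by_period))
```

The rate of convergence slows as non-default rates approach one, which is one reason a different solver may be preferable in some applications.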

For specific realizations of the states, the conditional non-default probabilities are computed using (11). The expected unconditional non-default probability of each observation is given by (12), which is computed using Monte Carlo techniques using the simulated draws described above as follows:

$$\hat{\bar{P}}_{i,t}(\xi_t, X_{i,t}; \zeta', \sigma', l_i) = \frac{\frac{1}{S_i} \sum_{s=1}^{S_i} \prod_{\tau=1}^{t-1} P_{s,\tau}(X_{i,\tau}, \xi_\tau, Y_{s,t}, \mu_s; \zeta', l_i)\, P_{s,t}(X_{i,t}, \xi_t, Y_{s,t}, \mu_s; \zeta', l_i)}{\frac{1}{S_i} \sum_{s=1}^{S_i} \prod_{\tau=1}^{t-1} P_{s,\tau}(X_{i,\tau}, \xi_\tau, Y_{s,t}, \mu_s; \zeta', l_i)} \tag{22}$$
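The ratio in (22) can be sketched as follows, taking the per-draw probabilities as already computed. The input layout (`prob_paths[s]` holding the probabilities of draw `s` for periods 1 through `t`) is a hypothetical convention chosen for the example.

```python
import math

def conditional_non_default(prob_paths):
    """Simulated non-default probability at t given survival to t - 1.

    prob_paths[s] lists the non-default probabilities of simulated
    draw s for periods 1, ..., t; the last entry refers to period t.
    The 1/S factors in the numerator and denominator cancel.
    """
    numerator = 0.0
    denominator = 0.0
    for path in prob_paths:
        survival = math.prod(path[:-1])  # probability of reaching period t
        numerator += survival * path[-1]
        denominator += survival
    return numerator / denominator

# Two draws, two periods: the draw more likely to survive gets more weight.
print(conditional_non_default([[0.9, 0.8], [0.5, 0.6]]))
```

The survival weights are what make the estimator account for selection: draws of income and $\mu$ that make early default likely contribute little to the probability in later periods.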

These probabilities are in turn used to compute a new likelihood function, which in addition to (16) accounts for the likelihood of the error in (18):

$$\ell(\zeta', \sigma', \alpha') = \prod_{i \in N_t} \frac{1}{S_i} \sum_{s=1}^{S_i} \left[ \prod_{t} P_{s,t}(X_{i,t}, \xi_t, Y_{s,t}, \mu_s; \zeta', l_i)^{D_{i,t}} \left(1 - P_{s,t}(X_{i,t}, \xi_t, Y_{s,t}, \mu_s; \zeta', l_i)\right)^{(1 - D_{i,t})} \right] \phi(\nu_s) \tag{23}$$

where $\phi(\cdot)$ is the normal density function of $\nu$ with variance $\alpha_2^2$. We find estimates of $\{\zeta, \sigma, \alpha\}$ by maximizing (23) numerically.
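A minimal sketch of the simulated likelihood in (23) is given below. It takes the per-draw probabilities $P_{s,t}$ as already computed, recovers $\nu_s$ from the loading equation (18) as $LTV_i - \alpha_0 - \alpha_1 \mu_s$, and treats $\alpha_2$ as the standard deviation of $\nu$. The data layout and function names are hypothetical.

```python
import math

def normal_pdf(x, sd):
    """Density of a mean-zero normal with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def debtor_likelihood(prob_paths, outcomes, ltv, mus, alpha):
    """Simulated likelihood contribution of one debtor.

    prob_paths[s][t] is the non-default probability of draw s in period t,
    outcomes[t] is 1 if the debtor did not default in period t and 0
    otherwise, ltv is the initial loan-to-value ratio, mus[s] is the
    heterogeneity draw and alpha = (alpha0, alpha1, alpha2).
    """
    alpha0, alpha1, alpha2 = alpha
    total = 0.0
    for path, mu in zip(prob_paths, mus):
        history = 1.0
        for p, d in zip(path, outcomes):
            history *= p if d == 1 else (1.0 - p)
        nu = ltv - alpha0 - alpha1 * mu  # residual of the loading equation
        total += history * normal_pdf(nu, alpha2)
    return total / len(mus)

def log_likelihood(debtors, alpha):
    """Sum of log contributions; each debtor is (prob_paths, outcomes, ltv, mus)."""
    return sum(math.log(debtor_likelihood(*d, alpha)) for d in debtors)

# One debtor, one draw: survives the first period and defaults in the second.
value = debtor_likelihood([[0.9, 0.8]], [1, 0], 0.5, [0.0], (0.5, 1.0, 1.0))
print(value)
```

Working with the sum of logs rather than the raw product avoids numerical underflow when the number of debtors is large; a numerical optimizer would then maximize this quantity over the parameters.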

3.3 Results

Estimates are found by maximizing (15) over the space of parameters estimates of the aggregate shocks and its transition parameters the maximization routine as described above.

{ξ, ρ}

{ζ, σ}.

In addition,

are found as part of

Results of the estimation are displayed in

Table 3. We estimate four versions of the model. Models I and II arestatic" models in which the discount rate is assumed to be zero, so that consumers disregard the continuation payos of their non-default decisions (i.e.

the

option value of defaulting in the future). Model I has no income variables whereas Model II does. Notice that these models are equivalent to a duration model, in the sense that they account for the survival probabilities but ignore the dynamic incentives of debtors. Models III and IV are dynamic models with discount rate income variation whereas model IV does.

β = 0.97;

model III has no controls for

Including and excluding the income controls allows us to assess their effect on the results of the estimation. The most interesting feature of the results is the effect of including the several controls for the unobserved heterogeneity. As reported earlier, the raw data indicated a negative correlation between home prices and non-default. After controlling for unobserved heterogeneity, this negative relationship disappears. In the case of the duration model, including controls for only the unobserved risk attitude of debtors and ignoring the unobserved income (Model I) generates a positive price coefficient on the non-default probability. In the case of the dynamic model, including controls for only the unobserved risk attitude of debtors (Model III) still yields a negative price coefficient, which turns positive once controls for the unobserved variation of income are included (Model IV). The estimates of α in all models indicate that a large unobserved "taste" for non-default, which would be consistent with a relatively high aversion to risk, is negatively correlated with the initial loan-to-value ratio, as expected.

Notice, finally, that the estimates of ρ generated by the dynamic model imply that the aggregate component of the unobserved state is correlated over time and has a significant variance. Consequently, debtors have strong incentives to account for the continuation payoffs of their default decisions.
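The difference between the static and the dynamic specifications can be illustrated with a minimal sketch. It is not the estimated model: it assumes a two-state Markov chain for the aggregate state, mean-zero logit taste shocks, default as a terminal action with payoff normalized to zero, and hypothetical flow payoffs and transition probabilities. The function name `non_default_prob` and all numerical values are illustrative assumptions.

```python
import numpy as np

def non_default_prob(beta, flow, trans, tol=1e-10, max_iter=10_000):
    """Probability of not defaulting in each aggregate state (stylized sketch).

    Assumptions (hypothetical, not the paper's specification): default is a
    terminal action with payoff 0, taste shocks are mean-zero logit, and the
    aggregate state follows a finite Markov chain with transition matrix `trans`.
    """
    value = np.zeros_like(flow, dtype=float)  # expected value before shocks are drawn
    for _ in range(max_iter):
        # Payoff of paying today plus the discounted option to default later.
        cont = flow + beta * (trans @ value)
        new_value = np.logaddexp(0.0, cont)  # log(exp(0) + exp(cont))
        converged = np.max(np.abs(new_value - value)) < tol
        value = new_value
        if converged:
            break
    cont = flow + beta * (trans @ value)
    return 1.0 / (1.0 + np.exp(-cont))

# Hypothetical flow payoffs of non-default in a "bad" and a "good" aggregate state.
flow = np.array([-0.5, 1.0])
# Persistent aggregate state: each state repeats with probability 0.9.
trans = np.array([[0.9, 0.1],
                  [0.1, 0.9]])

p_static = non_default_prob(0.0, flow, trans)    # beta = 0, as in Models I and II
p_dynamic = non_default_prob(0.97, flow, trans)  # beta = 0.97, as in Models III and IV
print("static :", p_static)
print("dynamic:", p_dynamic)
```

In this sketch, setting β = 0 makes the non-default probability depend only on the current flow payoff, whereas with β = 0.97 the option of defaulting in a later period raises the probability of non-default in both states.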

3.4

Final remark

The dynamic model of default described above was estimated with a methodology that accounts for a very rich structure of unobserved heterogeneity. Specifically, it incorporates individual-level heterogeneity using both survey and simulated data; it also incorporates aggregate time-varying heterogeneity, which is inferred directly from the observed aggregate behavior. The standard techniques for estimating dynamic structural models have limited applicability due to difficulties associated with incorporating correlated unobserved states. In that sense, the applicability of our methodology goes beyond the estimation of default models. It can be used to estimate dynamic structural models in environments with both micro-level and aggregate data.


References

Bayer, P., F. Ferreira, and R. McMillan (2007): "A Unified Framework for Measuring Preferences for Schools and Neighborhoods," Journal of Political Economy, 115(4), 588-638.

Berry, S. (1994): "Estimating Discrete Choice Models of Product Differentiation," RAND Journal of Economics, 25, 242-262.

Berry, S., J. Levinsohn, and A. Pakes (1995): "Automobile Prices in Market Equilibrium," Econometrica, 63(4), 841-890.

Carranza, J. E., and D. Estrada (2007): "An empirical characterization of mortgage default in Colombia between 1997 and 2004," Unpublished manuscript, University of Wisconsin-Madison, Department of Economics.

Cunha, F., J. J. Heckman, and S. Navarro (2007): "The Identification and Economic Content of Ordered Choice Models with Stochastic Cutoffs," International Economic Review, 48(4), 1273-1309.

Deng, Y., J. M. Quigley, and R. Van Order (2000): "Mortgage Terminations, Heterogeneity and the Exercise of Mortgage Options," Econometrica, 68(2), 275-307.

Heckman, J. J., and S. Navarro (2007): "Dynamic Discrete Choice and Dynamic Treatment Effects," Journal of Econometrics, 136(2), 341-396.

Rust, J. (1987): "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher," Econometrica, 55(5), 999-1033.

Table 1: Summary statistics (main data set)

(1)       (2)       (3)          (4)       (5)       (6)        (7)
Quarter   Number    Outstanding  Default   Mean      Mean       Price/
          of loans  loans        rate      Price 1   Price 2    Balance
1997:1    93        93           0.00 %    167.98    167.9828   53.23 %
1997:2    355       351          1.14 %    85.69     85.3543    47.17 %
1997:3    591       575          2.09 %    87.28     86.3226    47.25 %
1997:4    925       892          1.91 %    85.12     84.0201    46.01 %
1998:1    925       856          4.21 %    91.18     88.4451    44.96 %
1998:2    925       831          3.01 %    95.70     91.6633    43.60 %
1998:3    925       810          2.59 %    95.11     90.2188    45.65 %
1998:4    925       788          2.79 %    95.70     89.3045    47.57 %
1999:1    925       750          5.07 %    100.14    91.3159    48.65 %
1999:2    925       704          6.53 %    94.14     91.0233    49.66 %
1999:3    925       680          3.53 %    77.67     85.8669    51.58 %
1999:4    925       634          7.26 %    61.55     92.029     48.44 %
2000:1    925       598          6.02 %    59.76     87.9514    49.14 %
2000:2    925       586          2.05 %    65.43     96.0334    43.04 %
2000:3    925       555          5.59 %    58.92     94.8815    44.00 %
2000:4    925       539          2.97 %    59.74     95.4666    42.74 %
2001:1    925       526          2.47 %    67.15     107.0776   37.51 %
2001:2    925       513          2.53 %    61.34     97.2037    42.04 %
2001:3    925       502          2.19 %    66.04     104.0606   39.06 %
2001:4    925       491          2.24 %    69.75     108.6502   36.62 %
2002:1    925       489          0.41 %    63.46     98.703     39.29 %
2002:2    925       483          1.24 %    71.43     110.7895   34.48 %
2002:3    925       473          2.11 %    66.99     103.5303   35.78 %
2002:4    925       462          2.38 %    76.25     117.7744   30.71 %
2003:1    925       456          1.32 %    70.26     108.3027   32.00 %
2003:2    925       453          0.66 %    73.77     113.6786   30.25 %
2003:3    925       450          0.67 %    72.92     112.0695   29.47 %
2003:4    925       448          0.45 %    73.87     114.9951   27.67 %
2004:1    925       444          0.90 %    72.45     113.0203   27.33 %
2004:2    925       439          1.14 %    80.93     125.7102   23.91 %

Prices and balances are in 1997 COL$. Mean Price 1 and Mean Price 2 are computed over outstanding and all loans, respectively.

Table 2: Logit Regressions

Variable    Est.         t-stat    Est.          t-stat
Constant    4.284        21.91     4.881         86.65
Balance     -0.005       -3.09     -0.011        -27.78
Price       -0.000007    -1.28     0.004         22.03
Term        -0.016       -4.39     -0.016        -31.78
Income                             0.00000024    -8.32
LTV                                -0.542        -15.47

Income was simulated from its distribution conditional on mortgage payments. The regression on the right-hand side also includes time effects.


Table 3: Estimation results

Coefficient    Model I         Model II        Model III       Model IV
γ1 (Price)     0.006635663     0.00672481      -0.026195844    0.008152264
γ2 (Income)                    -0.000351462                    -8.35113E-05
γ3 (Balance)   -0.01799785     -0.017566723    -0.065647672    -0.019551768
α0             0.48276361      0.482771227     0.482953878     0.482819736
α1             -0.235081032    -0.236802085    -0.079400546    -0.049066327
α2             0.032110149     0.032192441     0.031584769     0.035797393
σ2             0.077317205     0.07473171      0.437034046     0.154595017
ρ1                                             0.991803576     0.997144916
ρ2                                             0.487002961     0.208507686
