An empirical dynamic model of mortgage default in Colombia between 1997 and 2004
Juan Esteban Carranza, Salvador Navarro

Abstract

We estimate a dynamic model of default for a cohort of Colombian debtors between 1997 and 2004, which was a period of unprecedented financial stress in Colombia. We develop a methodological framework based on a fully dynamic behavioral model that accounts for both cross-sectional and aggregate heterogeneity. Specifically, we control for the unobserved heterogeneity using non-matching survey data and a dynamic aggregate shock that can be inferred directly from the observed aggregate behavior. Results indicate that the dynamic structure and the unobserved heterogeneity are crucial for correctly identifying the impact of different factors on default behavior. We also point out that the methodology's applicability goes beyond the estimation of default models.
1 Introduction
The standard approach for estimating mortgage default risk is the use of duration models, as described in Deng, Quigley, and Van Order (2000). The advantage of this approach over standard multinomial choice models is that it allows the estimation of default risk even when default rates are very low. Nevertheless, the estimates obtained from such models have no clear connection to a behavioral model, so their use for policy analysis is limited: the estimates are not "structural" and therefore may not be stable across counterfactual equilibria [1]. Moreover, neither duration models nor standard multinomial choice models fully account for the dynamic incentives of debtors, and they may therefore yield misleading estimates of the underlying determinants of debtor behavior. For example, if home prices are expected to keep rising over time, such models may suggest that debtors' default behavior is unresponsive to home prices, even if that is not true.

In this paper, we construct a dynamic model of mortgage default and estimate it using micro-level Colombian data spanning the years between 1997 and 2004. During this time, mortgage default rates in Colombia were unusually high due to an unprecedented economic downturn that was accompanied by a dramatic fall in home prices. The extent to which the fall in household incomes and the fall in home prices contributed separately to the unprecedented rates of default is a relevant policy question that can be answered with the proposed estimation. [We'll mention something more specific about the results here.]

As usual, the difficulties of estimating the dynamic model of default stem from the need to allow for a rich pattern of unobserved heterogeneity. We develop a technique for estimating the model that allows for the presence of an unobserved state variable that is correlated across debtors and over time. Specifically, we use simulation methods to control for the unobserved heterogeneity of debtors that is constant over time, the distribution of which, as shown in Heckman and Navarro (2007), is identified nonparametrically. In addition, using ideas from the empirical IO literature, we develop a methodology that allows us to incorporate an aggregate shock that is correlated over time. Both the value of the aggregate shock and its transition can be recovered nonparametrically.

The paper proceeds as follows: in the next section we describe our theory of mortgage default using a dynamic decision model and propose an estimation strategy. Then, we describe the data and the details of the estimation, including results.

[1] See Cunha, Heckman, and Navarro (2007).
2 An empirical model of mortgage default

2.1 The individual model of default
We study the behavior of mortgage holders ("debtors") who live in the mortgaged piece of real estate ("home"). The utility that a debtor $i$ gets from the home depends on a measure $q_{i,t}$ of subjective home quality. It also depends on the difference between household income and mortgage payments, $y_{i,t} - m_{i,t}$, and on an idiosyncratic preference shock $\varepsilon_{i,t}$, which incorporates unobserved (to the econometrician) variables that may affect default, e.g. home attributes that are only valued by its owner and other preference shocks that vary across consumers and time. The estimation of the model is ultimately based on the properties of these unobserved variables.

Let $W_{i,t}$ denote the value for individual $i$ of defaulting on her mortgage at time $t$. This value is the result of a complex scenario. Specifically, the individual may be waiting to see whether she can pay back her dues in the following period; she may try to sell the home and cash the difference between price and loan balance; she may let the bank take over the property to cover her obligation; finally, she could also just stop making payments indefinitely and face forfeiture or a renegotiation with the bank. The resulting value of default $W_{i,t}$ is the weighted sum of payoffs across the random scenarios just described. For simplicity, we write $W_{i,t}$ as a reduced form

$$W_{i,t} = W\left(\bar{\pi}_{i,t}, b_{i,t}, y_{i,t}, \varepsilon^w_{i,t}\right), \tag{1}$$

where $\bar{\pi}_{i,t}$ is the expected price of the home at time $t$, $b_{i,t}$ is the balance of the debt, $y_{i,t}$ is the debtor's income and $\varepsilon^w_{i,t}$ are other unobserved (to the econometrician) attributes. These are variables that directly enter the payoffs of the individual scenarios arising after a default decision, as discussed above.

A debtor will choose to default on her mortgage if the utility of making the mortgage payments and consuming the housing services from her home is lower than the utility obtained by not making the loan payment at the time. That is, if we let $V_{l_i,t}$ be the value for a debtor with a mortgage of term length $l_i$ of not defaulting on her loan at time $t$, we can recursively write the value of the decision problem as follows:

$$
\begin{aligned}
V_{l_i,t} &= \max\left\{ u(q_{i,t}, y_{i,t} - m_{i,t}, \varepsilon^u_{i,t}) + \beta E V_{l_i,t+1},\; W\left(\bar{\pi}_{i,t}, b_{i,t}, y_{i,t}, \varepsilon^w_{i,t}\right) \right\} \quad \text{for } t < T_i^*, \\
V_{l_i,T_i^*} &= \max\left\{ u(q_{i,T_i^*}, y_{i,T_i^*} - m_{i,T_i^*}, \varepsilon^u_{i,T_i^*}),\; W\left(\bar{\pi}_{i,T_i^*}, b_{i,T_i^*}, y_{i,T_i^*}, \varepsilon^w_{i,T_i^*}\right) \right\},
\end{aligned}
\tag{2}
$$

where $u(.)$ is the instant payoff from consuming the home in the current period and $T_i^*$ is the last period of the mortgage.

In order to fix ideas, we assume that the payoff of default $W_{i,t}$ is a linear function of relevant states:

$$W_{i,t} = \omega_0 + \omega_1 y_{i,t} + \omega_2 \bar{\pi}_{i,t} + \omega_3 b_{i,t} + \varepsilon^w_{i,t}. \tag{3}$$

This payoff is a linear combination of payoffs across random outcomes. If these payoffs are linear functions of states, then the linear payoff function $W_{i,t}$ should be stable across counterfactual equilibria, as long as states don't affect the probabilities of individual outcomes. Notice that a careful interpretation of the function $W_{i,t}$ is important because the usefulness of the model for counterfactual analysis relies on the assumption that this function will not change when we change the values or the transition probabilities of the state variables.

We also assume that the static utility is additively separable in observable and unobservable states:
$$u(q_{i,t}, y_{i,t} - m_{i,t}, \varepsilon^u_{i,t}) = \theta_0 + \gamma q_{i,t} + \alpha (y_{i,t} - m_{i,t}) + \varepsilon^u_{i,t}. \tag{4}$$

Then, it follows from (2), (3) and (4) that a debtor $i$ will choose to continue paying her dues if:

$$\theta_0 + \gamma q_{i,t} + \alpha (y_{i,t} - m_{i,t}) + \varepsilon^u_{i,t} + \beta E_t V_{l_i,t+1} \;\geq\; \omega_0 + \omega_1 y_{i,t} + \omega_2 \bar{\pi}_{i,t} + \omega_3 b_{i,t} + \varepsilon^w_{i,t}, \tag{5}$$

where the continuation payoff will depend on all observed and unobserved state variables, as described below. Since no home attributes are observed in our application, we will further assume that the unobserved "quality" of homes $q_{i,t}$ is random:

$$q_{i,t} \equiv \kappa + \varepsilon^q_{i,t}, \tag{6}$$

where $\varepsilon^q_{i,t}$ is a random variable that is potentially correlated over time and across debtors.
Any systematic difference in the subjective home quality across debtors will be captured by the correlation structure of the error, which will be described in the estimation section below.

In our data set we have no information on the required payments $m_{i,t}$ of each debtor. However, it is known that the required payments are linear functions of mortgage balances $b_{i,t}$ and remaining term $L_{i,t}$, with some random variation across debtors:

$$m_{i,t} = \rho_0 + \rho_1 b_{i,t} + \rho_2 L_{i,t} + \varepsilon^m_{i,t}, \tag{7}$$

where $\varepsilon^m_{i,t}$ is an error term. Group the unobserved components into one error term $\bar{\varepsilon}_{i,t} \equiv \gamma \varepsilon^q_{i,t} - \alpha \varepsilon^m_{i,t} + \varepsilon^u_{i,t} - \varepsilon^w_{i,t}$ and let $\Omega_{i,t} = \{\bar{\pi}_{i,t}, y_{i,t}, b_{i,t}, L_{i,t}, \bar{\varepsilon}_{i,t}\}$ be the vector of observed and unobserved states. Assume that $\Omega_{i,t}$ follows a first-order Markov process so that $E[\Omega_{i,t+1} | \Omega_{i,t}] = \Psi(\Omega_{i,t})$.

By substituting (3), (4), (6) and (7) in (2), we obtain the value of the debtor's problem at each point in time as a function of variables that can be mapped to the data and of unobserved random variables. For mortgages that have reached their last period $T_i^*$, the value function is given by:
$$V_{l_i,T_i^*}(\Omega_{i,T_i^*}) = \max\left\{0,\; \zeta_0 + \zeta_1 \bar{\pi}_{i,T_i^*} + \zeta_2 y_{i,T_i^*} + \zeta_3 b_{i,T_i^*} + \zeta_4 L_{i,T_i^*} + \bar{\varepsilon}_{i,T_i^*}\right\}, \tag{8}$$

whereas for $t < T_i^*$, this value is:

$$V_{l_i,t}(\Omega_{i,t}) = \max\left\{0,\; \zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 y_{i,t} + \zeta_3 b_{i,t} + \zeta_4 L_{i,t} + \bar{\varepsilon}_{i,t} + \beta E\left[V_{l_i,t+1}(\Omega_{i,t+1}) \,|\, \Omega_{i,t}\right]\right\}, \tag{9}$$

where the parameters to be estimated, $\zeta = \{\zeta_0, \zeta_1, \zeta_2, \zeta_3, \zeta_4\}$, are linear combinations of the underlying structural parameters. Notice that this function can be computed recursively starting from the last period if all the state variables and their transition are known.

In order to allow for a rich pattern of choice behavior, the unobserved state $\bar{\varepsilon}_{i,t}$ will be decomposed as follows:

$$\bar{\varepsilon}_{i,t} = \xi_t + \mu_i + \epsilon_{i,t}, \tag{10}$$

where the term $\xi_t$ is a common aggregate unobserved shock, $\mu_i$ is an individual-specific unobservable state and $\epsilon_{i,t}$ is an iid idiosyncratic disturbance. This specification allows individual choices to be correlated over time and across debtors; in addition, this unobserved heterogeneity can be allowed to depend on other observed states such as income or debt balances, which would be equivalent to a model with heterogeneous $\zeta$ coefficients.

Let $D_{i,t} = 1$ be the event that debtor $i$ does not default at time $t$. Assume that the unobserved components of utility are normalized so that $\epsilon_{i,t}$ is a logit shock that is conditionally independent of the observed states. Under these assumptions, the model above generates the following non-default probability for debtor $i$ at time $t$, conditional on not having defaulted on the mortgage up to $t-1$:

$$P_{i,t}(\Omega_{i,t}; \zeta, l_i) = \Pr(D_{i,t} = 1 \,|\, \Omega_{i,t}) = \frac{e^{\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 y_{i,t} + \zeta_3 b_{i,t} + \zeta_4 L_{i,t} + \xi_t + \mu_i + \beta E[V_{l_i,t+1}(\Omega_{i,t+1}) | \Omega_{i,t}]}}{1 + e^{\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \zeta_2 y_{i,t} + \zeta_3 b_{i,t} + \zeta_4 L_{i,t} + \xi_t + \mu_i + \beta E[V_{l_i,t+1}(\Omega_{i,t+1}) | \Omega_{i,t}]}}. \tag{11}$$
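To make explicit how the coefficients in (8), (9) and (11) relate to the structural parameters, one can substitute (6) and (7) into the continuation condition (5) and collect terms. The grouping below is only a sketch derived from the linear forms (3), (4), (6) and (7); it abstracts from how the continuation value is renormalized once the default payoff is moved to the left-hand side.

```latex
\begin{align*}
&\theta_0 + \gamma(\kappa + \varepsilon^q_{i,t})
  + \alpha\,(y_{i,t} - \rho_0 - \rho_1 b_{i,t} - \rho_2 L_{i,t} - \varepsilon^m_{i,t})
  + \varepsilon^u_{i,t} + \beta E_t V_{l_i,t+1} \\
&\qquad \geq \omega_0 + \omega_1 y_{i,t} + \omega_2 \bar{\pi}_{i,t}
  + \omega_3 b_{i,t} + \varepsilon^w_{i,t} \\[4pt]
\Longrightarrow\quad
&\zeta_0 = \theta_0 + \gamma\kappa - \alpha\rho_0 - \omega_0, \qquad
 \zeta_1 = -\omega_2, \qquad
 \zeta_2 = \alpha - \omega_1, \\
&\zeta_3 = -(\alpha\rho_1 + \omega_3), \qquad
 \zeta_4 = -\alpha\rho_2, \qquad
 \bar{\varepsilon}_{i,t} = \gamma\varepsilon^q_{i,t} - \alpha\varepsilon^m_{i,t}
   + \varepsilon^u_{i,t} - \varepsilon^w_{i,t}.
\end{align*}
```

Under this reading, the data pin down combinations such as $\alpha - \omega_1$ rather than $\alpha$ and $\omega_1$ separately, unless further restrictions are imposed.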
Notice that the choice probabilities (11) are identical to the choice probabilities in standard Markovian decision processes as described by, for example, Rust (1987). The presence of the unobserved correlated states $\xi_t$ and $\mu_i$ complicates the estimation of the model [2]. In the following subsection, we describe the strategy we follow to estimate the model.
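The recursion behind (9) and (11) can be illustrated with a short sketch. It assumes, purely for illustration, that the path of the deterministic index $\zeta_0 + \zeta_1 \bar{\pi}_{i,t} + \dots + \xi_t + \mu_i$ is known in advance for one debtor, so the expectation over future states collapses to a single value; the function name and inputs are hypothetical and not part of the paper.

```python
import math


def non_default_probabilities(index_path, beta):
    """Backward recursion for one debtor under a logit shock.

    index_path[t] is the deterministic index at period t (assumed known in
    advance, which simplifies the Markov transition used in the text).
    Returns the per-period non-default probabilities in the spirit of (11).
    """
    horizon = len(index_path)
    expected_value_next = 0.0  # no continuation value after the last period
    probs = [0.0] * horizon
    for t in range(horizon - 1, -1, -1):
        # Payoff of not defaulting, net of the default payoff normalized to 0.
        v = index_path[t] + beta * expected_value_next
        probs[t] = 1.0 / (1.0 + math.exp(-v))
        # Expected maximum over the logit shock, log(1 + e^v), written in a
        # numerically stable form (additive constants are ignored).
        expected_value_next = max(v, 0.0) + math.log1p(math.exp(-abs(v)))
    return probs


if __name__ == "__main__":
    print(non_default_probabilities([0.0, 0.0, 0.0], beta=0.97))
```

With `beta=0` the function reduces to a sequence of static logit probabilities, which is one way to see what the continuation term adds in earlier periods.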
2.2 Estimation
Consider estimating the model above using debtor-level data on mortgage balances, mortgage terms and home prices over a set of $t = 1, \dots, T$ time periods. The objective is to obtain estimates of $\zeta$ by matching the predictions of the model to the observed data. As indicated above, there are two difficulties associated with estimating the model using the predicted choice probabilities in (11).

The first difficulty arises because the Colombian mortgage data we use does not contain matching income data tracing the evolution of income for individual debtors. We address this problem using household survey data containing information on debtors' income and mortgage holdings. Specifically, we treat income as an unobserved state with distribution given by the empirical distribution contained in the survey.

The second difficulty is not particular to our application, since it arises from the need to compute the value functions (8) and (9) along the estimation algorithm in the presence of the unobserved states $\mu$ and $\xi$. This presents a bigger challenge since the computation of (8) and (9) depends crucially on both the levels and the transitions of $\mu$ and $\xi$. On one hand, since the distribution of the individual-level persistent heterogeneity $\mu_i$ is nonparametrically identified [3], it can be simulated from any assumed parametric distribution. Moreover, this distribution can be assumed to be correlated with any observable characteristic. On the other hand, the aggregate randomness $\xi_t$ and its transition can be inferred directly by conditioning on all remaining parameters and equating the observed and predicted aggregate default behavior. This idea is similar to the estimation algorithm described by Berry (1994) and Berry, Levinsohn, and Pakes (1995) to infer the value of the unobserved attributes of goods from aggregate market shares.

Specifically, let $G^Y_t(Y|K)$ be the distribution of income conditional on the value of collateral, which is contained in the household surveys. Let $G^\mu(\sigma)$ be the parametric distribution of $\mu$ (e.g. normal), which depends on the parameter $\sigma$ that has to be estimated. Let $X_{i,t} = (\bar{\pi}_{i,t}, b_{i,t}, L_{i,t})$ denote the observed states. Given a transition for $X_{i,t}$ (which can be inferred directly from the data), then for any realization of the aggregate shocks (and their implied transition) and any choice of parameters $\{\zeta', \sigma'\}$ we can obtain the expected unconditional non-default probability of debtor $i$ at time $t$ by integrating the unconditional version of (11) over the distribution of $Y$ and $\mu$:

$$\bar{P}_{i,t}(\xi_t, X_{i,t}; \zeta', \sigma', l_i) = \frac{\int \prod_{\tau=1}^{t-1} P_{i,\tau}(X_{i,\tau}, \xi_\tau, Y, \mu; \zeta', l_i)\, P_{i,t}(X_{i,t}, \xi_t, Y, \mu; \zeta', l_i)\, dG^Y_t(Y|K)\, dG^\mu(\mu; \sigma')}{\int \prod_{\tau=1}^{t-1} P_{i,\tau}(X_{i,\tau}, \xi_\tau, Y, \mu; \zeta', l_i)\, dG^Y_t(Y|K)\, dG^\mu(\mu; \sigma')} \tag{12}$$

where the integral is taken with respect to the empirical conditional distribution of income and the assumed parametric distribution of the unobserved individual heterogeneity $\mu$. Moreover, for any vector of parameters $\{\zeta', \sigma'\}$ the model generates an aggregate non-default probability for each time period that depends only on the aggregate shock $\xi$:

$$\bar{P}_t(\xi_t; \zeta', \sigma') = \frac{1}{\text{size}(N_t)} \sum_{i \in N_t} \bar{P}_{i,t}(X_{i,t}, \xi_t; \zeta', \sigma', l_i) \tag{13}$$

where the average is taken over the set $N_t$ of debtors with outstanding debts at time $t$.

[2] In addition, in our application below no matching information is available on the income of individual debtors, so non-matching survey data have to be used instead.

[3] See Heckman and Navarro (2007) for a proof.
Equating the predicted and the observed aggregate non-default probabilities generates a system of $T$ non-linear equations. From this set of equations we can infer the values of the unobserved aggregate shocks. Solving for the vector of unobserved shocks $\xi = \{\xi_1, \dots, \xi_T\}$ has to be done using numerical techniques and is not easy, because its transition is not known.

To compute the value of the vector $\xi = \xi(\zeta', \sigma')$ that is consistent with the data and any vector $\{\zeta', \sigma'\}$ of model parameters, fix the vector of parameters $\{\zeta, \sigma\} = \{\zeta', \sigma'\}$ at any arbitrary value. Set also the vector of unobserved aggregate shocks equal to some initial value (e.g. $\xi^o = 0$) and estimate its transition. Given $\{\zeta', \sigma'\}$ and $\xi^o$, compute the expected continuation payoffs from (9) for a set of simulated draws of $\{\mu_1, \dots, \mu_S\}$ and $\{Y_1, \dots, Y_S\}$ over $T$ periods:

$$\bar{V}^o_{l_i,t}(\Omega^o_{i,t}) = \log\left(1 + e^{\zeta'_0 + \zeta'_1 \bar{\pi}_{i,t} + \zeta'_2 y_{i,t} + \zeta'_3 b_{i,t} + \zeta'_4 L_{i,t} + \xi^o_t + \mu_i + \beta E\left[\bar{V}^o_{l_i,t+1}(\Omega^o_{i,t+1}) | \Omega^o_{i,t}\right]}\right) \tag{14}$$

where the expectation is taken over the logit shock $\epsilon_{i,t}$ and $\Omega^o_{i,t} = \{\bar{\pi}_{i,t}, b_{i,t}, L_{i,t}, \xi^o_t, Y, \mu\}$ is the vector of states containing the initialized values $\xi^o$. This function can be computed using a fixed point algorithm given some bounded transition for the observed states.

Given the values $\bar{V}^o_{l_i,t}(\Omega^o_{i,t})$ and $\{\zeta', \sigma'\}$, we can solve for the vector $\xi'$ of implied aggregate shocks that solves the equality of predicted and observed aggregate choice probabilities ($s_t$):

$$s_t = \bar{P}_t(\xi'; \zeta', \sigma') \tag{15}$$

This equality generates a new vector $\xi'$ and a new transition associated with it. For the model to be consistent we need $\xi' = \xi^o$. In the application below we construct a fixed point algorithm and iterate on $\xi$ (and its transition) in (15) until $\xi'$ converges to $\xi^* = \xi^*(\zeta', \sigma')$, so that for any choice of parameters, a corresponding set of aggregate shocks can be found. Notice that other solving methods may be more appropriate depending on the properties of the problem.
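As a rough illustration of how (15) can be inverted for a single period, the sketch below applies the log-share update that the paper later states as (21) to a static logit, i.e. it assumes $\beta = 0$, a fixed set of debtor-level indices, and no re-estimation of the transition. All function names are hypothetical.

```python
import math


def logistic(v):
    """Logit probability of non-default for a given index value."""
    return 1.0 / (1.0 + math.exp(-v))


def predicted_share(xi_t, indices):
    """Average predicted non-default probability across debtors, as in (13)."""
    return sum(logistic(v + xi_t) for v in indices) / len(indices)


def solve_aggregate_shock(observed_share, indices, tol=1e-10, max_iter=10_000):
    """Iterate xi' = xi + log(s_t) - log(P_t(xi)) until it stops moving.

    indices holds each debtor's index excluding the aggregate shock; in the
    full model these would also contain the continuation values.
    """
    xi = 0.0  # initial guess, mirroring xi^o = 0 in the text
    for _ in range(max_iter):
        xi_new = xi + math.log(observed_share) - math.log(predicted_share(xi, indices))
        if abs(xi_new - xi) < tol:
            return xi_new
        xi = xi_new
    raise RuntimeError("fixed point iteration did not converge")


if __name__ == "__main__":
    debtor_indices = [0.0, 1.0, -0.5]
    xi_star = solve_aggregate_shock(0.8, debtor_indices)
    print(xi_star, predicted_share(xi_star, debtor_indices))
```

In this static case the update has slope strictly between 0 and 1 in $\xi$, so the iteration converges monotonically; the dynamic problem in the text additionally updates the transition and value functions at each step, which this sketch leaves out.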
With the values of the aggregate shocks, $\xi^*(\zeta', \sigma')$, at hand we can compute the likelihood of the sample for any choice of parameters $\{\zeta', \sigma'\}$, which is the product across debtors of individual default/non-default histories, integrated over the distribution of the unobservables:

$$\ell(\zeta', \sigma') = \prod_{i \in N_t} \int \left[ \prod_t P_{i,t}\left(\Omega^*_{i,t}; \zeta', l_i\right)^{D_{i,t}} \left(1 - P_{i,t}\left(\Omega^*_{i,t}; \zeta', l_i\right)\right)^{(1 - D_{i,t})} \right] dG^Y_t(Y|K)\, dG^\mu(\mu; \sigma') \tag{16}$$

where $\Omega^*_{i,t}$ is the vector of states containing the aggregate shocks $\xi^*$ obtained using the algorithm described above. Estimates of $\zeta$ and $\sigma$ can therefore be obtained by finding the values that maximize (16).
The idea that aggregate fixed effects can be concentrated out from the estimation algorithm by taking advantage of the identity of the predicted and observed aggregate behavior has seldom been used in contexts different from the usual BLP-style estimation [4]. To our knowledge, it has never been used in the context of a dynamic programming problem.
3 Data and estimation results

3.1 Description of the data
The estimation of the model above is based on two separate data sets. The first (or "main") data set contains information on 16000 random mortgages that were outstanding between 1997 and 2002. The monthly payment history of each mortgage, its original and current value and term of the mortgaged home are included. A "secondary" data set contains non-matching individual-level demographic data, including income and real estate holdings.

The total number of loans contained in the main data set is 16000. Nevertheless, this set of mortgages includes loans that started at different points in time, most of them before 1997. From this subset of loans that started before 1997 we only observe those that survived until 1997. Since our model predicts that loan survival is endogenous, for the estimation below we selected the cohort of loans started during the year 1997 and assumed that the distribution of unobserved attributes of new debtors is the same throughout that year. After eliminating from our sample those loans with incomplete or inconsistent payment histories, we ended with a total of 925 loans which are observed from the time they start in 1997 until 2004 [5]. Despite the reduction in the number of mortgages in our data set, we end up with a panel with 14250 observations.

[4] For an exception, see Bayer, Ferreira, and McMillan (2007).

The data set contains only the price of each home at the time the loan started, as reported by the bank. These prices are very reliable because banks have strong incentives to appraise the collateral accurately. The expected prices of individual homes at any point in time, $\bar{\pi}_{i,t}$, are updated using housing price indices constructed by the Colombian Central Bank. In addition, all data is aggregated into quarters, so that default observations are not confounded with missed payments or coding errors. All variables are expressed in constant 1997 Colombian pesos.

Since this main data set contains no information on the income of debtors over the span of the sample, survey data from the secondary data set was used to control for the changing distribution of income. This data set is part of an annual survey conducted by DANE that contains large samples of individual household demographics. We selected households in the sample who reported having a home loan. We use the reported income and matching housing payments to simulate the joint distribution of income and the other state variables.

In the data it is observed that debtors sometimes temporarily stop making their payments. Therefore, what "default" means has to be defined. Specifically, in the estimation below, loans that accumulate past due payments of more than 3 months are assumed to be defaulted. "Default" is thus defined as the event in which the number of past due payments in a loan history changes from 3 or less to more than 3 between two quarters. After a loan is defined to be defaulted, it is dropped from the sample [6].

Table 1 contains some summary statistics of the main data set, which goes from the first quarter of 1997 to the second quarter of 2004 [7]. The number of loans in the data set increases during the first four quarters of 1997 as new loans are initiated, until reaching 925, which is the total number of loans in the cohort. Notice from column (3) that the number of non-defaulted loans decreases gradually over time, which is a reflection of the high number of defaults observed in the sample. The default rate in column (4), defined as the number of defaults over the total number of outstanding loans, reaches a level higher than 7% during the fourth quarter of 1999, which is indicative of the severity of the market collapse. By the end of the sample more than half of the loans in the sample were defaulted.

[5] For a detailed study of the default behavior observed in the whole sample using a simpler empirical model, see Carranza and Estrada (2007).

[6] The default rate based on this definition is highly correlated with default rates based on longer default periods. The 3-month threshold was chosen in order to observe as much default as possible and in order to capture all defaulted loans, including those that are terminated soon after default.

[7] Since default is inferred from the change in the number of past due mortgage payments, no default is reported during the first period of the sample.

To give a sense of the characteristics of the defaulted loans, we computed the average price of homes with outstanding loans (column (5)) and the average price of all homes in the sample (column (6)). Notice that up until the middle of 1999, the average price of homes with outstanding mortgages was higher than the average price of the homes of all the loans in the sample, which implies that defaults tended to occur among the mortgages of the least expensive homes. After 1999 the price of homes with outstanding loans was lower than the average price of all homes in the sample, which implies that defaults were concentrated among mortgages of the more expensive homes.

The second and third columns of Table 2 characterize the observed correlations contained in the main data set. Specifically, a simple logit model of non-default was estimated using the definition of default described above.
The use of the logit model will facilitate the comparison with the estimates of the structural model. Explanatory variables include the mortgage balance, the expected price of the collateral and the remaining term of the loan at each point in time, which are the relevant states in the model described above. As expected, non-default is negatively correlated with the balance of mortgages at any point in time and with their remaining term.

The loose negative correlation between non-default and home prices is not consistent with a model of rational behavior, in which debtors have weaker incentives to default the higher the price of the collateral. In fact, we would expect this relationship to be positive. The problem is that home prices should be expected to be correlated with unobserved state variables. For example, home prices should be correlated with the unobserved household income.

Besides the rich modelling of the structural error in our model, we use the secondary data set to account for the unobserved variation in individual incomes. As we describe in detail below, we construct a simulated data set with ten simulated draws for each observation. We will use the simulated data to integrate out part of the unobserved heterogeneity. The draws are taken from the corresponding quintile of the distribution of income ordered according to the monthly mortgage payments, which is assumed to match the distribution of income conditional on the ratio of balance to remaining term.

The fourth and fifth columns of Table 2 report the results of the same logit regression described above using the simulated data set. As can be seen, the coefficient of balances is more negative, whereas the coefficient of remaining term doesn't change much. According to the logit regression, the income draws have a very small but precisely estimated positive correlation with non-default, which is not surprising. More importantly, the inclusion of income in the regression causes the price coefficient to become positive and significant, as was expected.

In addition to the simulated income, the regression includes fixed time effects and the initial loan to value (LTV), which is regarded as a crucial indicator of the risk attitude of debtors. As expected, non-default is negatively correlated with this ratio. The estimates of the time effects (not shown), which are measured with respect to the constant in the second quarter of 1997, are mostly significant, which suggests that there are other time-changing unobserved factors that affect default.

Even though the results of these regression estimates are somewhat consistent with the described behavioral model, they are only descriptive. Nevertheless, the significant correlations described above are the basis for the econometric identification of the structural model.
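The definition of default used in this section (past due payments moving from 3 or fewer to more than 3 between two quarters, with the loan dropped afterwards) can be sketched as follows. The input format, a list of past-due counts per quarter, and the function names are assumptions made for illustration; the paper does not describe how the payment histories are stored.

```python
def first_default_quarter(past_due, threshold=3):
    """Return the index of the first quarter in which default occurs.

    past_due[q] is the number of past due monthly payments at quarter q.
    Default is the move from <= threshold to > threshold between two
    consecutive quarters. Returns None if the loan never defaults.
    """
    for q in range(1, len(past_due)):
        if past_due[q - 1] <= threshold and past_due[q] > threshold:
            return q
    return None


def non_default_indicators(past_due, threshold=3):
    """Build D_t (1 = no default) from the second observed quarter onward.

    The first quarter carries no indicator because default is inferred from
    a change between quarters; observations after default are dropped.
    """
    q_default = first_default_quarter(past_due, threshold)
    if q_default is None:
        return [1] * (len(past_due) - 1)
    return [1] * (q_default - 1) + [0]


if __name__ == "__main__":
    print(non_default_indicators([0, 1, 3, 5, 6]))  # loan defaults in quarter index 3
```

Raising `threshold` gives the longer default periods mentioned in the footnote on alternative definitions.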
3.2 Computation of the model
As indicated above, the estimation of the model is based on a simulated data set that allows the computation of the integrals in (12). Specifically, for each mortgage $i$ at time $t$, a set of $S_i$ income draws $\{Y_{st}\}_{s=1,\dots,S_i}$ is simulated from the corresponding quintile of the empirical distribution of income conditional on the monthly mortgage payments, contained in the "secondary" data set. In addition, $S_i$ draws of the random shock $\mu_s$ are generated from a normal distribution:

$$\mu \sim N(0, \sigma^2) \tag{17}$$

We assume that this randomness is correlated with the initial loan-to-value ratio of each loan, which as said before is regarded as a good predictor of the risk attitude of debtors. Specifically, we assume that this underlying correlation is determined by the following loading equation:

$$LTV_i = \alpha_0 + \alpha_1 \mu_s + \nu_s \tag{18}$$

where $\nu_s \sim N(0, \alpha_2^2)$ is a normal error with mean zero and variance $\alpha_2^2$. The coefficients of this equation and the variance of the error, $\alpha = \{\alpha_0, \alpha_1, \alpha_2\}$, are estimated jointly with the other parameters of the model.
The computation of the likelihood of individual default/non-default observations requires in addition the computation of the value functions (14). Such computation requires first the estimation of the transition of the state variables that are changing over time. We estimate directly the transition of the observed states and the simulated income. Consistently with the Markovian structure of the model, we assume that they follow independent first-order autoregressive processes:

$$
\begin{aligned}
b_{it+1} &= \rho^b_0 + \rho^b_1 b_{it} + \omega^b_{it} \\
\pi_{it+1} &= \rho^\pi_0 + \rho^\pi_1 \pi_{it} + \omega^\pi_{it} \\
y_{st+1} &= \rho^y_0 + \rho^y_1 y_{st} + \omega^y_{st}
\end{aligned}
\tag{19}
$$

The main difficulty in computing the value functions is posed by the unobserved aggregate shocks $\xi$, which affect the computation both through their levels and their transition. Instead of estimating them along the estimation algorithm, we solve directly for the values of $\xi$ and its transition from the equality of observed and predicted aggregate default behavior as given by (13). Specifically, we solve for the values of $\xi$ and use them to estimate its transition, which is assumed to follow an autoregressive process:

$$\xi_{t+1} = \rho^\xi_1 \xi_t + \rho^\xi_2 \omega^\xi_t \tag{20}$$
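The first-order autoregressive transitions in (19) can be fitted in several ways; the text does not say which estimator is used. The sketch below assumes ordinary least squares on a single series, with a hypothetical function name, and returns the intercept, the slope and the residual standard deviation.

```python
def fit_ar1(series):
    """Fit x[t+1] = rho0 + rho1 * x[t] + omega[t] by ordinary least squares.

    OLS is an assumption made for this illustration. Returns
    (rho0, rho1, residual_std).
    """
    lagged = series[:-1]
    lead = series[1:]
    n = len(lagged)
    mean_lagged = sum(lagged) / n
    mean_lead = sum(lead) / n
    sxx = sum((x - mean_lagged) ** 2 for x in lagged)
    sxy = sum((x - mean_lagged) * (y - mean_lead) for x, y in zip(lagged, lead))
    rho1 = sxy / sxx
    rho0 = mean_lead - rho1 * mean_lagged
    residuals = [y - rho0 - rho1 * x for x, y in zip(lagged, lead)]
    residual_std = (sum(e * e for e in residuals) / n) ** 0.5
    return rho0, rho1, residual_std


if __name__ == "__main__":
    # Noise-free series generated by x[t+1] = 1 + 0.5 * x[t], starting at 0.
    path = [0.0, 1.0, 1.5, 1.75, 1.875]
    print(fit_ar1(path))
```

For the aggregate shock in (20), which has no intercept, the same idea applies with the constant dropped; inside the fixed point routine the fit would be repeated each time a new vector of shocks is obtained.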
Finding these values is a non-trivial numerical exercise because the transition itself enters the computation of the value functions, which are computed for each simulated observation at each point in time. We solve for them using a fixed point algorithm similar to the one described in Berry (1994) and BLP:

$$\xi' = \xi + \log(s_t) - \log\left(\bar{P}_t(\cdot, \rho^\xi)\right) \tag{21}$$

where $s_t$ and $\bar{P}_t$ are the observed and predicted average non-default rates. The fixed point of this mapping is the vector of aggregate shocks associated with the given vector of parameters, $\xi(\zeta', \sigma')$. For each iteration of the fixed point algorithm, the parameters $\rho^\xi$ of the transition of $\xi$ are re-estimated.

Given the vector $\{\zeta', \sigma'\}$ of parameters and any guess of $\xi$ and its transition, we compute the value functions recursively starting from the last period of the mortgage over a grid of states and then interpolate to find the values for specific realizations of the state vector. We perform these computations in the fixed point mapping (21) to find $\xi^* = \xi(\zeta', \sigma')$.
For specific realizations of the states, the conditional non-default probabilities are computed using (11). The expected unconditional non-default probability of each observation is given by (12), which is computed by Monte Carlo techniques using the simulated draws described above, as follows:

$$\hat{\bar{P}}_{i,t}(\xi_t, X_{i,t}; \zeta', \sigma', l_i) = \frac{\frac{1}{S_i} \sum_{s=1}^{S_i} \prod_{\tau=1}^{t-1} P_{s,\tau}(X_{i,\tau}, \xi_\tau, Y_{s,t}, \mu_s; \zeta', l_i)\, P_{s,t}(X_{i,t}, \xi_t, Y_{s,t}, \mu_s; \zeta', l_i)}{\frac{1}{S_i} \sum_{s=1}^{S_i} \prod_{\tau=1}^{t-1} P_{s,\tau}(X_{i,\tau}, \xi_\tau, Y_{s,t}, \mu_s; \zeta', l_i)} \tag{22}$$
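The ratio in (22) can be read as a survival-weighted average across simulated draws. The sketch below assumes the conditional probabilities for each draw have already been computed and stored as one list per draw; that layout and the function name are hypothetical.

```python
import math


def simulated_non_default_probability(draw_probs, t):
    """Monte Carlo version of the ratio in (22) for one debtor.

    draw_probs[s][tau] is the conditional non-default probability of draw s
    at period tau (0-based). Each draw is weighted by its probability of
    having survived all periods before t; the 1/S_i factors cancel.
    """
    numerator = 0.0
    denominator = 0.0
    for path in draw_probs:
        survival = math.prod(path[:t])  # equals 1.0 when t == 0
        numerator += survival * path[t]
        denominator += survival
    return numerator / denominator


if __name__ == "__main__":
    draws = [[0.9, 0.9], [0.5, 0.5]]
    print(simulated_non_default_probability(draws, t=1))
```

In the two-draw example, the draw with the higher survival probability receives more weight in the second period, so the result lies above the simple average of 0.7. This is the selection effect that conditioning on survival is meant to capture.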
These probabilities are in turn used to compute a new likelihood function, which in addition to (16) accounts for the likelihood of the error in (18):

$$\ell(\zeta', \sigma', \alpha') = \prod_{i \in N_t} \frac{1}{S_i} \sum_{s=1}^{S_i} \left[ \prod_t P_{s,t}(X_{i,t}, \xi_t, Y_{s,t}, \mu_s; \zeta', l_i)^{D_{i,t}} \left(1 - P_{s,t}(X_{i,t}, \xi_t, Y_{s,t}, \mu_s; \zeta', l_i)\right)^{(1 - D_{i,t})} \right] \phi(\nu_s) \tag{23}$$

where $\phi(.)$ is the normal density function of $\nu$ with variance $\alpha_2^2$. We find estimates of $\{\zeta, \sigma, \alpha\}$ by maximizing (23) numerically.
3.3 Results
Estimates are found by maximizing (23) over the space of parameters $\{\zeta, \sigma, \alpha\}$. In addition, estimates of the aggregate shocks and their transition parameters $\{\xi, \rho\}$ are found as part of the maximization routine as described above. Results of the estimation are displayed in Table 3.

We estimate four versions of the model. Models I and II are "static" models in which the discount factor is assumed to be zero, so that consumers disregard the continuation payoffs of their non-default decisions (i.e. the option value of defaulting in the future). Model I has no income variables whereas Model II does. Notice that these models are equivalent to a duration model, in the sense that they account for the survival probabilities but ignore the dynamic incentives of debtors. Models III and IV are dynamic models with discount factor $\beta = 0.97$; Model III has no controls for income variation whereas Model IV does. The inclusion and non-inclusion of the income controls allows us to assess their effect on the results of the estimation.

The most interesting feature of the results is the effect of including the several controls for the unobserved heterogeneity. As reported earlier, the raw data indicated a negative correlation between home prices and non-default. After controlling for unobserved heterogeneity, this negative relationship disappears. In the case of the duration model, including controls only for the unobserved risk attitude of debtors and ignoring the unobserved income (Model I) generates a positive price coefficient on the non-default probability. In the case of the dynamic model, the inclusion of controls for only the unobserved risk attitude of debtors (Model III) still yields a negative price coefficient, which turns positive once controls for the unobserved variation of income are included (Model IV).

The estimates of $\alpha$ in all models indicate that a large unobserved "taste" for non-default, which would be consistent with a relatively high aversion to risk, is negatively correlated with the initial loan-to-value ratio, as expected. Notice, finally, that the estimates of $\rho$ generated by the dynamic model imply that the aggregate component of the unobserved state is correlated over time with a significant variance. Consequently, debtors have strong incentives to account for the continuation payoffs of their default decisions.
3.4 Final remark
The dynamic model of default described above was estimated with a methodology that accounts for a very rich structure of unobserved heterogeneity. Specifically, it incorporates individual-level heterogeneity using both survey and simulated data; it also incorporates aggregate time-varying heterogeneity, which is inferred directly from the observed aggregate behavior. The standard techniques for estimating dynamic structural models have limited applicability due to difficulties associated with incorporating correlated unobserved states. In that sense, the applicability of our methodology goes beyond the estimation of default models. It can be used to estimate dynamic structural models in environments with both micro-level and aggregate data.
References

Bayer, P., F. Ferreira, and R. McMillan (2007): "A Unified Framework for Measuring Preferences for Schools and Neighborhoods," Journal of Political Economy, 115(4), 588-638.

Berry, S. (1994): "Estimating Discrete Choice Models of Product Differentiation," RAND Journal of Economics, 25, 242-262.

Berry, S., J. Levinsohn, and A. Pakes (1995): "Automobile Prices in Market Equilibrium," Econometrica, 63(4), 841-890.

Carranza, J. E., and D. Estrada (2007): "An empirical characterization of mortgage default in Colombia between 1997 and 2004," Unpublished manuscript, University of Wisconsin-Madison, Department of Economics.

Cunha, F., J. J. Heckman, and S. Navarro (2007): "The Identification and Economic Content of Ordered Choice Models with Stochastic Cutoffs," International Economic Review, 48(4), 1273-1309.

Deng, Y., J. M. Quigley, and R. Van Order (2000): "Mortgage Terminations, Heterogeneity and the Exercise of Mortgage Options," Econometrica, 68(2), 275-307.

Heckman, J. J., and S. Navarro (2007): "Dynamic Discrete Choice and Dynamic Treatment Effects," Journal of Econometrics, 136(2), 341-396.

Rust, J. (1987): "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher," Econometrica, 55(5), 999-1033.
Table 1: Summary statistics (main data set)

(1)      (2)       (3)          (4)      (5)      (6)       (7)
Quarter  Number    Outstanding  Default  Mean     Mean      Price/
         of loans  loans        rate     Price 1  Price 2   Balance
1997:1   93        93           0.00%    167.98   167.9828  53.23%
1997:2   355       351          1.14%    85.69    85.3543   47.17%
1997:3   591       575          2.09%    87.28    86.3226   47.25%
1997:4   925       892          1.91%    85.12    84.0201   46.01%
1998:1   925       856          4.21%    91.18    88.4451   44.96%
1998:2   925       831          3.01%    95.70    91.6633   43.60%
1998:3   925       810          2.59%    95.11    90.2188   45.65%
1998:4   925       788          2.79%    95.70    89.3045   47.57%
1999:1   925       750          5.07%    100.14   91.3159   48.65%
1999:2   925       704          6.53%    94.14    91.0233   49.66%
1999:3   925       680          3.53%    77.67    85.8669   51.58%
1999:4   925       634          7.26%    61.55    92.029    48.44%
2000:1   925       598          6.02%    59.76    87.9514   49.14%
2000:2   925       586          2.05%    65.43    96.0334   43.04%
2000:3   925       555          5.59%    58.92    94.8815   44.00%
2000:4   925       539          2.97%    59.74    95.4666   42.74%
2001:1   925       526          2.47%    67.15    107.0776  37.51%
2001:2   925       513          2.53%    61.34    97.2037   42.04%
2001:3   925       502          2.19%    66.04    104.0606  39.06%
2001:4   925       491          2.24%    69.75    108.6502  36.62%
2002:1   925       489          0.41%    63.46    98.703    39.29%
2002:2   925       483          1.24%    71.43    110.7895  34.48%
2002:3   925       473          2.11%    66.99    103.5303  35.78%
2002:4   925       462          2.38%    76.25    117.7744  30.71%
2003:1   925       456          1.32%    70.26    108.3027  32.00%
2003:2   925       453          0.66%    73.77    113.6786  30.25%
2003:3   925       450          0.67%    72.92    112.0695  29.47%
2003:4   925       448          0.45%    73.87    114.9951  27.67%
2004:1   925       444          0.90%    72.45    113.0203  27.33%
2004:2   925       439          1.14%    80.93    125.7102  23.91%

Prices and balances are in 1997 COL$. Mean Price 1 and Mean Price 2 are computed over outstanding and all loans, respectively.
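The default rates in column (4) of Table 1 appear to equal the quarterly drop in outstanding loans divided by the number of loans still outstanding at the end of the quarter. This is an inference from the reported figures rather than a definition stated by the authors. The sketch below checks it for a few quarters after the cohort is complete at 925 loans; the function name is hypothetical.

```python
# Outstanding loans at the end of each quarter, copied from column (3) of Table 1.
outstanding = {
    "1997:4": 892,
    "1998:1": 856,
    "1998:2": 831,
    "1998:3": 810,
    "1998:4": 788,
    "1999:1": 750,
}


def implied_default_rates(outstanding_by_quarter):
    """Percentage default rate implied by consecutive outstanding-loan counts.

    ASSUMPTION: every loan leaving the outstanding pool is a default, and the
    rate is expressed relative to end-of-quarter outstanding loans.
    """
    quarters = list(outstanding_by_quarter)
    rates = {}
    for prev, curr in zip(quarters, quarters[1:]):
        exits = outstanding_by_quarter[prev] - outstanding_by_quarter[curr]
        rates[curr] = round(100.0 * exits / outstanding_by_quarter[curr], 2)
    return rates


rates = implied_default_rates(outstanding)
for quarter, rate in rates.items():
    print(quarter, f"{rate:.2f}%")
```

For these quarters the implied rates coincide with the reported ones (4.21%, 3.01%, 2.59%, 2.79% and 5.07%).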
Table 2: Logit regressions

Variable   Est.         t-stat    Est.         t-stat
Constant   4.284        21.91     4.881        86.65
Balance    -0.005       -3.09     -0.011       -27.78
Price      -0.000007    -1.28     0.004        22.03
Term       -0.016       -4.39     -0.016       -31.78
Income                            0.00000024   -8.32
LTV                               -0.542       -15.47

Income was simulated from its distribution conditional on mortgage payments. The regression on the right-hand side also includes time effects.
Table 3: Estimation results

Coefficient    Model I        Model II       Model III      Model IV
γ1 (Price)     0.006635663    0.00672481     -0.026195844   0.008152264
γ2 (Income)                   -0.000351462                  -8.35113E-05
γ3 (Balance)   -0.01799785    -0.017566723   -0.065647672   -0.019551768
α0             0.48276361     0.482771227    0.482953878    0.482819736
α1             -0.235081032   -0.236802085   -0.079400546   -0.049066327
α2             0.032110149    0.032192441    0.031584769    0.035797393
σ2             0.077317205    0.07473171     0.437034046    0.154595017
ρ1                                           0.991803576    0.997144916
ρ2                                           0.487002961    0.208507686