Combining systemic and non-systemic risk scores

Journal of the Operational Research Society (2013), 1–12

© 2013 Operational Research Society Ltd. All rights reserved. 0160-5682/13 www.palgrave-journals.com/jors/

Robert M Oliver*

University of California, Berkeley, CA, USA

This paper proposes a proportional odds model to combine systemic and non-systemic risk for prediction of default and prepay performance in cohorts of booked loan accounts. We assume that performance odds is proportional to two independent factors, one based on age-dependent systemic, possibly external, global disruptions to a cohort of individual accounts, the other on traditional non-systemic information odds based on demographic, behavioural and financial payment patterns of the individual accounts. A proportional odds model provides a natural formulation that can combine hazard rate predictions of baseline defaults, prepayments and active accounts with traditional non-systemic risk scores of individuals within the cohort. Theoretical comparisons with proportional hazard models are illustrated. Although our model is developed in terms of Good/Bad performance, it can include late payments, prepayments, defaults, as well as responses to offers and other classifications. We make 60-month default and prepay forecasts under two different systemic risk scenarios for a portfolio of Alt A mortgages with 24-month ‘teaser rates’ originated in 2004.

Journal of the Operational Research Society advance online publication, 18 December 2013; doi:10.1057/jors.2013.134

Keywords: systemic risk; non-systemic risk; hazard rates; information odds; default and prepay scores; credit risk

1. Introduction Banks and other financial institutions use credit risk scores to help them in making acquisition and account management decisions for personal, auto and mortgage loans as well as credit cards. Simple cutoff rules that focus primarily on risk have been augmented by acquisition decisions that use many risk- and response-based pricing models to affect profitability and control risk which, typically, are both affected by systemic and non-systemic risk. Once loan accounts have been acquired, financial institutions often measure performance of borrower accounts as well as the performance of an entire vintage or portfolio. Homogeneous segments with the same acquisition date, loan type, risk profile, APR, financial product, are often identified and separately tracked as a single cohort which means one can tally cumulative monthly counts of numbers of defaults and prepayments as well as the number and condition of accounts that remain in an active (current) status. Vintage analyses are often used to show trends in charge-offs, delinquencies, defaults, losses, and prepayments as a function of calendar time or age of each cohort. With the passage of time, new economic conditions or disruptions, not included in the original assessments, may influence future risk; thus, it is critical to examine how new information and macroeconomic disruptions can play a part alongside the traditional non-systemic risk scoring methods that are used in making loan pricing and acquisition decisions. New

*Correspondence: Robert M Oliver, University of California at Berkeley, 260 Southampton Ave, Berkeley, CA 94707-2039, USA.

predictions resulting from these analyses can be used to anticipate performance of future acquisitions of the same or possibly different financial products with different appeal to the borrower, profitability to the lender and regulatory capital and flow requirements. It has been noted by Lo (2012) that protections designed to cope with relatively stable and low-frequency non-systemic risk may not be able to cope with high-frequency disruptions usually associated with systemic risk. Capital adequacy rules based on examinations of historical performance outcomes under stationary conditions are ill-equipped to provide meaningful estimates of safety when systemic risks provide unfamiliar shocks and large disruptions to transaction and flow processes; furthermore, the response of regulators to unanticipated events may be ad hoc or uninformed about controls that magnify rather than reduce risk. During periods when market conditions are in a steady state, the dominant risk may be non-systemic but when there are major disruptions in financial markets with new economic conditions, products and policies, systemic risks often dominate. Depending on the operational situation and the length of the forecast horizon, changes in the systemic risk component may be more important than the relative risk of the individual. This is no criticism of the immense contribution made by technological developments in estimating traditional credit risk scores but, rather, recognition of the obvious fact that nonstationary dislocations can easily overwhelm the predictions and protections afforded by their ‘steady-state’ counterparts. Even though estimates of precise timing and scale of disruptive shocks may be difficult, one should not equate


time- or age-dependent systemic risk with unpredictable risk. Many of the US mortgage-based securities in 2004–2009 had repeatable default and prepayment patterns that were the direct result of ‘teaser rate’ loan pricing products. Although disruptive prepayment patterns were identified as early as 2003, many financial institutions encouraged use of such products by also assuming borrowers could be easily refinanced. What was missing was any attempt to model or predict how total risk was conserved independently of packaging, how defaults and prepayments were dynamically interrelated over time or how new product designs or sudden macroeconomic changes in interest rates or house prices might influence refinancing, prepay and default patterns. An immense concentration of highly leveraged non-systemic risk ultimately led to the creation of systemic risk (see Ferguson, 2012).

Our proposed model assesses performance by explicitly assuming that the systemic and non-systemic risk for an account is represented as the posterior odds of a risk classification that can be decomposed into two distinct and independent parts: one depending on an age-dependent baseline for a cohort or vintage that may include disruptive time- or age-dependent variables, the other being a non-systemic component that describes risk of an individual relative to the baseline. In other words, we view non-systemic risk as an adjustment (increase or decrease) to the more critical systemic risk component, where both factors are combined to predict total risk. An analogy from physics would be that one could decompose the motion of stars in a large galaxy (portfolio or cohort of booked accounts) into the baseline motion of the galaxy as a whole plus the perturbations in motion of the stars (individual accounts) relative to the galaxy. In terms that are familiar to retail credit scoring experts, we separate the information odds (InfOdds) from a possibly disruptive age-dependent population odds (PopOdds) for each cohort.
The overall risk odds is the product of two independent factors, the first of which is the baseline, which changes with time and age of the cohort, whereas the second is the information odds (InfOdds) associated with the low-frequency or stationary performance risk of the individual relative to the baseline. This factorization makes it possible to combine traditional non-systemic individual risk scores with models of a systemic risk baseline that describe the cohort or portfolio as a whole. This way of thinking about default and prepayment predictions differs from models that assume both PopOdds and InfOdds components are combined within a single age- or time-dependent framework such as the proportional hazard rate model of Cox and Oakes (1984).

In practice, one often finds a slow degradation of non-systemic risk scorecards over time; a more severe and difficult problem arises when unforeseen economic events occur, when there are abrupt changes in the borrower’s ability to pay or where there are sudden disruptions caused by new products and acquisition policies. In such cases, risk assessment and prediction should include the possibility of unforeseen disruptive events as well as the impact of policy decisions made by lenders. Several statistical models have been proposed to recognize age-dependent effects on credit risk classifiers: the

simplest one is to replace a static classifier or non-systemic scorecard when the original is deemed to have undergone significant degradation. Crook et al (2004) study the effect of developing multiple default scorecards under different economic conditions. However, the act of building a new scorecard with new data when the original one is deemed to be ‘too old’ does not directly address the possibility of time- or age-dependent structural changes.

Survival analysis models appear to be a natural way to formulate the systemic risk component, although these predictions are inherently more difficult than non-systemic ones. One of the first applications of survival analysis to credit scoring was the paper by Narain (1992), who assumed that the density function of times to failures (Bads) was exponential. He was primarily concerned with statistical estimation when there is right-censoring of observed failures. A small population of credit card owners was examined with estimates based on a proportional hazards model developed by Cox (1972). Hand and Kelly (2001) point out that financial institutions continually make offers of new financial products and must quickly assess performance. In their analysis the authors also use a proportional hazards model with right-censored Bads. In a medical setting, Efron (2002) proposes an elegant two-way proportional hazards model that incorporates calendar or clock time as well as age or lifetime of individuals; he accomplishes this by assuming the hazard rate function is the product of independent calendar- and age-dependent factors. Thomas (2009) uses the traditional decomposition of PopOdds and InfOdds scores to convert point-in-time (PIT) estimates of default probabilities to average through-the-cycle (TTC) estimates required by regulatory capital constraints.
Bellotti and Crook (2009) incorporate macroeconomic variables such as interest rates, unemployment rates and house price indices in a proportional hazards model to make assessments of the influence of macroeconomic variables on the probability of account default over time. The macroeconomic variables are assembled as time-varying covariates that interact, post acquisition, with traditional behavioural characteristics. Whittaker et al (2007) update a scorecard over time by using a Kalman filter that monitors and extrapolates time-dependent weights of evidence in a borrower’s score. Breeden (2007, 2009) proposes a model structure in which expected default rate of a portfolio cohort is assumed proportional to the product of three (exponential) factors, one of which depends on vintage age, the second on clock time and the third on a risk quality measure for the particular vintage. Surprisingly, the identical mathematical structure is claimed for individual borrower account default probabilities and hazard rates, with individual risk scores substituting for the portfolio quality measure. Im et al (2012) offer a variant of a proportional hazards model that includes an additional time-scaling adjustment to the baseline hazard to provide better fits than is provided by a standard proportional hazards model. (As a warning to the reader, the phrase ‘baseline’ has several different meanings in most of these papers and differs from our own use in what follows). Pavlidis et al (2012) address the problem of


adapting credit risk classification to changes in population drift. They use an adaptive logistic regression model to estimate scores; parameters of the regression are updated with terms proportional to the gradient of a ‘discounted’ likelihood function. The results are similar to exponential smoothing procedures where parameters or estimates of unobservable states are updated with recent observations weighted more heavily than older ones.

Most of the credit rating models on loan portfolios in the financial literature assume total default risk is the sum of a (‘causal’) systematic risk component that influences all accounts in the portfolio and an unsystematic risk that can be diversified by increasing the number of accounts in a loan pool. Heitfeld (2010) illustrates why the pooling of loan account assets may not mitigate systemic risk. As best we understand, there are no published accounts of credit rating models where a different mathematical structure holds for a disruptive nonstationary age- or time-dependent systemic process operating in parallel with an independent, stationary non-systemic risk process.

In Figure 1 we show the acquisition (booking) and ageing of 12 loan accounts starting at calendar time t1 = 0 and age τ = 0. The term of all loans in the cohort is T, which is a number (25–30 years for mortgages, 4–5 years for car loans, 1–2 years for equity loans) much larger than the span represented on the t-axis, where account bookings may be separated by minutes and hours; because of the limitation of space for a graphical illustration, the reader should understand that the vertical (age) axis may cover a range several orders of magnitude larger than the horizontal (calendar time) axis. In a discrete model of cohort accounts, bookings may occur at discrete time intervals but, in general, we depict them as occurring in continuous time. Accounts that prepay (P) do not default and accounts that default (D) do not prepay.
D and P states represent terminal or trapped states and cannot return to become active accounts (A). At the end of term, active accounts are reclassified as Good (G) accounts. Increases in defaults and/or prepays reduce the number of active accounts so that the latter are non-increasing over time. In some situations there are two transient states, one being an active status, the other a state in which the borrower is in arrears but may return to active status by paying all interest, penalty fees and balances owed; for

Figure 1 Lexis diagram for active, prepay, default and good accounts. (Horizontal axis: calendar time, t; vertical axis: cohort age, τ, up to the term T.)

purposes of this paper, we consider only one active or current state. In those situations where there are no prepays we use a binary label with active accounts consisting of either Goods (G) or defaults labelled as Bads (B).

There are many reasons why the number of active and terminated accounts and cash flows in a cohort vary with time and the age of the portfolio. Defaults and prepayments (terminations) of active accounts occur for many reasons: loss of income, unemployment, moving to a new location, sudden changes in financial capacity of individual borrowers, the availability of new refinancing offers or disruptive economic conditions which force borrowers to default or prepay. Terminations always reduce the number of active (current) portfolio accounts. Unless one allows the addition of newly booked accounts to the cohort or portfolio (sampling with replacement), the number of active accounts is reduced to zero when the loan term expires.

We can keep track of the counts of loans that are in one of three states: an active or current status (A), Default (D) or Prepay (P). At any point in calendar time any account that has not yet been identified as a D or P is labelled an A even though its eventual status may be D or P. The cumulative counts are:

$$N_A(\tau_2) \le N_A(\tau_1), \quad N_D(\tau_2) \ge N_D(\tau_1), \quad N_P(\tau_2) \ge N_P(\tau_1), \qquad \text{all } 0 \le \tau_1 \le \tau_2 < T. \tag{1}$$

The sum over all states at every age equals the (constant) number of booked accounts of the cohort at time zero,

$$N = N_A(0) = N_A(\tau) + N_D(\tau) + N_P(\tau), \qquad \text{all } 0 \le \tau < T. \tag{2}$$
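The monotonicity conditions in (1) and the conservation condition in (2) can be checked mechanically on a table of cumulative counts. The following is a minimal sketch, not part of the original analysis: the function name `check_cohort_counts` and the intermediate counts in the usage note are hypothetical, and the counts are assumed to be listed in order of increasing age starting at age zero.

```python
from typing import Sequence


def check_cohort_counts(active: Sequence[int],
                        default: Sequence[int],
                        prepay: Sequence[int]) -> bool:
    """Return True if cumulative counts satisfy conditions (1) and (2).

    Each sequence holds the count observed at successive ages,
    starting with age zero (hypothetical input layout).
    """
    n_booked = active[0]  # N = N_A(0): all booked accounts start active
    for k in range(len(active)):
        # Conservation (2): the three states always sum to N.
        if active[k] + default[k] + prepay[k] != n_booked:
            return False
        if k > 0:
            # Monotonicity (1): actives never increase,
            # defaults and prepays never decrease.
            if active[k] > active[k - 1]:
                return False
            if default[k] < default[k - 1] or prepay[k] < prepay[k - 1]:
                return False
    return True
```

For a 12-account cohort such as the one in Figure 1, a hypothetical history `check_cohort_counts([12, 11, 9], [0, 0, 1], [0, 1, 2])` passes both checks, whereas a history in which the active count rises, or in which the three counts no longer sum to 12, is rejected.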

At each age (dashed horizontal line) in Figure 1 the count of defaults is the number of D’s below the horizontal line, the count of prepays is the number of P’s below the line and the count of A’s is all the rest. For example, at age τ2, the count of actives is NA(τ2) = 9, the count of prepays is 2 and there is 1 default; accounts that have not defaulted or prepaid are counted as actives, A. Note that at any time prior to T, the active A accounts may contain unrevealed D and P accounts that will only become identified at a later time. At T, the accounts that have not prepaid or defaulted are labelled as Good, G.

Section 2 of this paper describes the additive structure of the age-dependent systemic PopOdds score and the traditional non-systemic InfOdds score; we compare our model with the proportional hazards model (Cox, 1972) often used in the credit risk literature (see Appendix A). In section 3, we develop the age-dependent distributions for unconditional baseline systemic risk and conditional data-dependent probabilities of default that include both systemic and non-systemic risk components. In section 4 we show the relation between conditional hazard rates and conditional risk scores. This is followed, in section 5, by the inclusion of Prepays in the baseline and non-systemic risk calculations. In section 6 we analyse the 60-month performance of a cohort of Alt A 30-year mortgages originated in 2004 with ‘teaser rates’ that expired 24 months after acquisition. The combined risk effects clarify how a disruptive, but predictable, subprime systemic risk trajectory can be created by non-linear interactions between defaults, prepayments and active accounts in a portfolio. Section 7 is a summary of findings and conclusions.

2. Combining information odds with population odds In the section that follows we temporarily exclude prepayments so that Bads correspond to defaults and active accounts at age τ < T are labelled as Goods, that is, accounts that have not yet defaulted. In theory, Goods should not be labelled as such until the end of term when all Bads have been revealed but we defer to standard credit scoring nomenclature. Initially, we specialize (1) and (2) so that we only consider active accounts and defaults where all loan accounts have a finite term T and, at age τ = T, the fractions of Goods and Bads are pG, pB with pG + pB = 1. We are not interested in infinite horizon operational situations where it is assumed all loan accounts eventually default. In general, cohorts can be defined in terms of segmentations of meaningful characteristics besides age or time on books; these might include the type of loan, the type of borrower, a characterization of the financial product or some combination of appropriate characteristics; we assume that one can keep track of cumulative numbers until end of term.

The familiar Bayesian structure for the Good/Bad odds of an individual account within a booked cohort of accounts can be expressed as the product of two multiplicative factors, one of which depends on behavioural, financial and demographic data, x. The well-known result for the Posterior Odds of a Good/Bad outcome is often written in the form,

$$o(G \mid x) = \frac{p(G \mid x)}{p(B \mid x)} = \frac{p(G)}{p(B)} \times \frac{f(x \mid G)}{f(x \mid B)} = o_{Pop} \times I(x), \qquad x \in X, \tag{3}$$

which states that the posterior odds, conditional on x, is the product of the prior Population Odds, oPop, times the Information Odds, I(x), or likelihood ratio for Goods and Bads. This leads to a log odds score that can be viewed as the sum of a constant PopOdds score plus a term known as the Information Odds score:

$$s(x) = \ln o(G \mid x) = \ln o_{Pop} + \ln \frac{f(x \mid G)}{f(x \mid B)} = s_{Pop} + s_{Inf}(x). \tag{4}$$
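The decomposition in (3) and (4) can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: the function name `log_odds_score` and its arguments (a prior fraction of Goods and the two likelihood values at an observed x) are hypothetical names introduced here for illustration.

```python
import math


def log_odds_score(p_good: float, lik_good: float, lik_bad: float):
    """Split the log odds score of (4) into its two additive parts.

    p_good   : prior fraction of Goods, p(G), with p(B) = 1 - p(G)
    lik_good : likelihood f(x | G) at the observed data x
    lik_bad  : likelihood f(x | B) at the observed data x
    Returns (s_pop, s_inf, s) with s = s_pop + s_inf.
    """
    s_pop = math.log(p_good / (1.0 - p_good))  # ln oPop
    s_inf = math.log(lik_good / lik_bad)       # ln I(x)
    return s_pop, s_inf, s_pop + s_inf
```

With illustrative inputs p(G) = 0.9 and a likelihood ratio of 2, the population odds are 9, the information odds are 2, and exponentiating the total score recovers posterior odds of 18, which is the product in (3).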

In practice, the x data is assembled and the (Good/Bad) performance outcomes are observed over a period of time that is similar in length to the forecast horizon where the scores are used to rank or predict future outcomes that lead to acquisition and pricing decisions. The InfOdds score is often found to be stable over time whereas the PopOdds may vary widely depending on the particular loan product, geographic location, the population in question and changing economic conditions. It is not uncommon to assume that economic and environmental conditions are stable enough so that historical

data collected over one or two years in the development and validation of scores can then be used to predict performance in an adjacent future window of one or two years. If, under the best of circumstances, PopOdds remains constant in future forecast horizons, as is often assumed during development and validation, then attention focuses exclusively on the quality, discriminatory and ranking powers of sInf(x). Because p(B|s(x)) depends on both PopOdds and InfOdds scores we emphasize that the presence of a superior InfOdds estimate in combination with an inferior PopOdds estimate is problematic for prediction and decision making.

If one thinks of an underlying time- or age-dependent baseline risk for the entire cohort increased or decreased by the special circumstances of each individual within the cohort, the simplest model (and in our experience, the most insightful) is one where the PopOdds component captures the age-dependent systemic risk for a cohort of pooled accounts while the other component, Information Odds, captures the non-systemic risk of each individual relative to the cohort. An age-dependent version of the posterior odds in (3) for the ith individual within the cohort can then be written as

$$o(G(\tau) \mid x_i) = \frac{P\{G(\tau) \mid x_i\}}{P\{B(\tau) \mid x_i\}} = \frac{\Pr\{\text{Good on or before } \tau \mid x_i\}}{\Pr\{\text{Bad on or before } \tau \mid x_i\}} = \frac{P\{G(\tau)\}}{P\{B(\tau)\}} \times \frac{f(x_i \mid G(\tau))}{f(x_i \mid B(\tau))} = o_{Pop}(\tau) \times I(x_i), \qquad \text{all } \tau, \tag{5}$$

where the posterior odds is factorable into two parts, one of which is age-dependent and a second, I(xi), is the familiar Information Odds based on financial, demographic and behavioural data associated with the ith individual. Each of the characteristics or attributes that define an individual in the vector xi can be a time-dependent process but we do not assume that time explicitly appears in the Information Odds in (5) as an independent ‘causal’ covariate; we assume the information odds extracted in (3) or (4) is stationary over long periods of time when compared with possible high-frequency variations in the systemic risk component. With stationary Information Odds, the log odds score for the ith individual is separable and additive:

$$s(\tau \mid x_i) = \ln o(G(\tau) \mid x_i) = \ln o_{Pop}(\tau) + \ln I(x_i) = s_{Pop}(\tau) + s_{Inf}(x_i). \tag{6}$$

The similarity with (4) is obvious; the first term on the right-hand side depends on cohort age, τ, possible seasonalities, disruptions in macro-economic variables or even financial policies that affect the baseline. The second term is the traditional Information Odds risk score that adjusts the individual score relative to the baseline and depends on the individual borrower and loan product. Thus, the total score incorporates the effect of a dynamic baseline as well as perturbations afforded by the individual’s risk performance relative to the baseline. The age-dependent probability of default is easily computed from the posterior odds:

$$P_B(\tau \mid x_i) = \frac{1}{1 + e^{s(\tau \mid x_i)}} = \frac{1}{1 + e^{s_{Pop}(\tau)}\, e^{s_{Inf}(x_i)}}. \tag{7}$$
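Equation (7) translates directly into code. The sketch below is illustrative only: the function names `default_probability` and `default_curve` are hypothetical, and the baseline score path passed to `default_curve` is assumed to be supplied by whatever baseline model is in use.

```python
import math
from typing import List, Sequence


def default_probability(s_pop_tau: float, s_inf: float) -> float:
    """P_B(tau | x_i) from (7), given sPop(tau) and sInf(x_i)."""
    return 1.0 / (1.0 + math.exp(s_pop_tau + s_inf))


def default_curve(s_pop_by_age: Sequence[float], s_inf: float) -> List[float]:
    """Apply (7) at each age of an assumed baseline score path.

    The individual's information odds score is held fixed
    (the stationarity assumption behind (6)).
    """
    return [default_probability(s_pop, s_inf) for s_pop in s_pop_by_age]
```

For example, a baseline score of ln 9 with sInf = 0 gives a default probability of 0.1; raising sInf to ln 2 doubles the Good/Bad odds to 18 and lowers the default probability to 1/19. Because the two scores are additive, the same shift in sInf is applied at every age along the baseline path.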

A referee of this paper has raised the question as to the timing of scorecard estimation, that is, before the acquisition or within the term of the loan. In the case of stationary non-systemic risk, the clock time for estimation is less critical but, as a practical matter, one should obtain these scores from characteristics that are relevant and reasonably close to acquisition or intended use. With systemic risk, however, one must carefully identify and synchronize age and clock time in the cohort being analysed.

3. Baseline PopOdds with age-dependent default hazards What happens to a cohort of Good and Bad loan accounts that is examined at different ages in its life, commonly known as a vintage analysis? At any point in time the cumulative count of Goods and Bads must add up to the total number of accounts, N, in the cohort. From booking to end of term, T, the count of Goods is non-increasing and of Bads is non-decreasing. As a special case of (1),

$$N_G(\tau_2) \le N_G(\tau_1), \quad N_B(\tau_2) \ge N_B(\tau_1), \qquad \tau_2 \ge \tau_1, \quad \text{all } 0 \le \tau \le T, \tag{8}$$

where NG(τ) + NB(τ) = N, τ ⩽ T, and it is understood that at the end of term the fractional composition of Goods/Bads in the cohort is pG, pB. Let X be the instant of time at which default of a baseline account occurs. The default hazard rate, hB(τ), is defined so that hB(τ) dτ is the conditional probability of default in (τ, τ + dτ] given survival in (0, τ]; it is well known that the probability the account will be active at τ is

$$P_A(\tau) = P_G(\tau) = \Pr\{X > \tau\} = \int_{\tau}^{\infty} p_B(u)\, du = e^{-H_B(\tau)}, \qquad H_B(\tau) \triangleq \int_0^{\tau} h_B(u)\, du. \tag{9}$$
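The relation in (9) between a hazard rate, its cumulative hazard and the probability of remaining active can be sketched numerically. This is an illustrative sketch only: the function names and the trapezoidal rule are choices made here, not taken from the paper, and the constant hazard in the usage note is a hypothetical input.

```python
import math
from typing import Callable


def cumulative_hazard(hazard: Callable[[float], float],
                      tau: float, steps: int = 1000) -> float:
    """Trapezoidal approximation of H_B(tau), the integral of h_B from 0 to tau."""
    if tau <= 0.0:
        return 0.0
    du = tau / steps
    total = 0.5 * (hazard(0.0) + hazard(tau))  # end points get half weight
    for k in range(1, steps):
        total += hazard(k * du)
    return total * du


def prob_active(hazard: Callable[[float], float],
                tau: float, steps: int = 1000) -> float:
    """P_A(tau) = exp(-H_B(tau)), as in (9)."""
    return math.exp(-cumulative_hazard(hazard, tau, steps))
```

With a hypothetical constant hazard of 0.002 per month, `cumulative_hazard(lambda u: 0.002, 60.0)` is 0.12 up to rounding, so about exp(−0.12) of the baseline accounts are still active after 60 months. A hazard with finite total area therefore leaves a positive fraction of accounts that never default.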

There is a finite probability of defaulting,

$$\Pr\{\text{default ever occurs}\} = \int_0^{\infty} p_B(u)\, du = 1 - e^{-H_B(\infty)}, \tag{10}$$

which leads us to consider two special cases: when the area under the default hazard rate is infinite, the cumulative hazard function grows without bound. In this unrealistic business case HB(∞) = ∞, the probability in (10) is one so that no active accounts end up being Good as all accounts in the cohort are guaranteed to default. The other case occurs when eventual default is uncertain and HB(∞) = −ln pG. Thus,

$$P_B(\infty) = \begin{cases} 1 & \text{when } H_B(\infty) \text{ is infinite}, \\ p_B = 1 - e^{-H_B(\infty)} & \text{when } 0 \le H_B(\infty)\ \ldots \end{cases}$$

[…]

$$\begin{cases} \ldots & n = 1, 2, \ldots, 23, \\ 0.07\, e^{-0.35(n-24)/12} = 0.141\,(0.971)^n & n = 24, 25, \ldots, 60. \end{cases} \tag{24}$$
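The branch of (24) for n = 24, …, 60 is written in two forms, an exponential decay and a geometric sequence with rounded constants. The short sketch below compares the two forms numerically; the function names are hypothetical, and nothing is assumed about the branch for n < 24.

```python
import math


def branch_exponential(n: int) -> float:
    """Exponential form of the n >= 24 branch in (24)."""
    return 0.07 * math.exp(-0.35 * (n - 24) / 12.0)


def branch_geometric(n: int) -> float:
    """Geometric form of the same branch, using the rounded constants."""
    return 0.141 * 0.971 ** n


def max_relative_gap(first: int = 24, last: int = 60) -> float:
    """Largest relative difference between the two forms over the range."""
    return max(
        abs(branch_exponential(n) - branch_geometric(n)) / branch_exponential(n)
        for n in range(first, last + 1)
    )
```

The two forms agree to within about two per cent over months 24 to 60, the gap coming from rounding 0.07·e^{0.7} to 0.141 and e^{−0.35/12} to 0.971.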

These baseline hazard rates were used to represent cohorts of Alt A 2/28 mortgages after origination but before loan portfolios were securitized and sold to investors. Cumulative fractions of active and terminated accounts are graphed in Figure 2 as a function of cohort age, based on the discrete hazard rates in

Figure 2 Original age-dependent baseline of actives, prepays and defaults (Alt A 2/28, 600 FICO, 80 LTV mortgage cohort). (Vertical axis: cumulative fraction P(·)(τ), from 0 to 1; horizontal axis: cohort age τ in months, from 0 to 60; curves labelled Active, Prepay and Default.)

(24). Although a default or prepayment can be thought of as an event occurring in continuous time, most counts are reported at the end of discrete, monthly, time periods. Based on historical patterns for default and prepayment rates in (24) the cumulative hazard functions for defaults and prepays and the expected cumulative fraction of active, default and prepaid accounts are easily calculated from discrete versions of (9) (Appendix B.2). The initial fixed-rate mortgage (FRM) had a trigger date for change of loan rates on the 24th month. After the 24th month of ownership the mortgage had an adjustable rate (ARM) indexed to a specified number of basis points above LIBOR. Individual borrowers anticipated the change in loan terms so that both the prepayments and active accounts showed abrupt changes around the 24th month.

Some financial institutions anticipated a possibly large increase in the prime interest rate in 2006–2007. Because it was believed that many of the Alt A loans were either second homes or investments anticipating home price increases, it was suspected there might be significant changes in the timing and flow of prepayments and defaults. To study this possibility one new scenario for a baseline assumed that only half of the homeowners who had previously prepaid and refinanced their mortgages with low teaser rates would be able to refinance. Among those unable to refinance, it was anticipated that half would continue as active accounts at high rates with the original ARM and stay in their homes; the other half would default on their loans either because they could not afford the higher ARM rates or felt that it was no longer worthwhile to pay the higher interest rates on a home whose value was uncertain and whose LTV might be greater than one (known as being under water). Under this scenario the prepay hazard is halved and one quarter of the original prepay hazard is added to the default hazard. Thus, one of many ‘stress’ tests was to study the effects of new hazards,

$$h'_P(n) = \tfrac{1}{2}\, h_P(n), \qquad h'_D(n) = h_D(n) + \tfrac{1}{4}\, h_P(n) = 0.002 + \tfrac{1}{4}\, h_P(n), \tag{25}$$

on the dynamics of new (primed) baselines as well as the posterior hazard rates and overall risk of individual accounts. Conservation of cohort accounts had to be maintained throughout the calculations. The baselines resulting from the new


Figure 3 New baseline of actives, prepays and defaults in proposed scenario. (Vertical axis: cumulative fraction P(·)(τ), from 0 to 1; horizontal axis: cohort age τ in months, from 0 to 60.)

hazard rates in (25) are graphed in Figure 3. Under the assumed exchanges between prepayments and defaults there is (i) a significant decrease in the cumulative prepay baseline, (ii) an increase in the cumulative default baseline from borrowers unable to refinance their loans and, surprisingly, (iii) a significant increase in the baseline of active accounts. In studying the effects of systemic and non-systemic risk, many individual accounts were studied with particular interest in those individuals who had low Bureau or FICO scores, one of several risk measures used in the acquisition process. Figure 4 illustrates the conditional Good/Bad Information Odds score distributions for a typical development/validation sample of non-systemic risk that were used alongside different systemic risk scenarios for Alt A accounts. For those unfamiliar with the use of unscaled sInf(x) scores, it should be noted that the Good/ Bad conditional frequencies are often well-approximated by a normal distribution with values in the range (−4, +4). The figure illustrates the conditional score densities for a collection of 7169 2/28 Alt A loans (FICO scores between 600 and 640 and 80 LTV) that were obtained from default and prepay records of an unnamed but large financial institution. The development sample had 6343 G (Goods), 826 B (Bads), a PopOdds score of 2.07, a conditional score average of 2.530 for G, 1.619 for B, variance of 0.977 for G and 0.659 for B. The K-S statistic was 0.405, ROC value of 0.758 and a Divergence of 1.007. For example, an individual with an sInf(x) score of +0.693 has a

Figure 5 Posterior default hazard rates from combined systemic and non-systemic risk. (Vertical axis: hD(τ | sInf(xi)); horizontal axis: cohort age τ in months, from 0 to 60; curves for sInf(xi) = −2, −1, 0 and +1.)

Good/Bad odds twice that of the baseline individual; a score of −0.693 has half the odds of the baseline. With sInf(x) values of +1 and −1, the odds factors are 2.718 and 0.368, respectively.

Originally, it was not obvious how changes in baseline prepayment hazards, such as those in (25), when combined with non-systemic risk scores, would affect posterior hazard rates and individual default probabilities. In Figure 5 we plot the conditional default hazard rates in (17) for four different sInf(x) values. The top curve illustrates how the conditional hazard rate for a borrower with a very low information odds score of sInf(x) = −2 is distorted from the baseline hazard rate with sInf(x) = 0. The lowest default hazard curve is an individual with sInf(x) = +1. It is clear that total default risk for non-systemic high-risk individuals is strongly affected by changes in baseline prepayment performance. It is not surprising that an increasing fraction of former prepays now default when refinancing opportunities decrease; what is surprising is that, in anticipation of uncertain refinancing, the non-linear dynamics of the reduction in baseline prepays yields disproportionately large default hazard rates for high-risk borrowers before the trigger date. For comparison purposes recall that the original default hazard rate in (24) was 0.002. In the early life of the portfolio the first observed defaults come from borrowers with very low Information Odds scores and high PopOdds, but, as the cohort ages, they include additional unrevealed defaults from borrowers with larger Information Odds scores but very much lower PopOdds scores. The combined effects are complex, dynamic and not always obvious.
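The stress scenario in (25) and its effect on the baselines can be sketched in a few lines. This is a rough illustration, not the calculation of Appendix B.2: the month-by-month recursion below (each month a fraction hD(n) of the remaining actives defaults and a fraction hP(n) prepays) is an assumption made here, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def stressed_hazards(h_default: Sequence[float],
                     h_prepay: Sequence[float]
                     ) -> Tuple[List[float], List[float]]:
    """Apply (25): halve the prepay hazard, move a quarter of it to defaults."""
    new_prepay = [0.5 * hp for hp in h_prepay]
    new_default = [hd + 0.25 * hp for hd, hp in zip(h_default, h_prepay)]
    return new_default, new_prepay


def baseline_fractions(h_default: Sequence[float],
                       h_prepay: Sequence[float]
                       ) -> List[Tuple[float, float, float]]:
    """Cumulative (active, default, prepay) fractions by month.

    Assumed discrete recursion: terminations in month n are taken
    from the accounts still active at the start of that month.
    """
    active, default, prepay = 1.0, 0.0, 0.0
    path = [(active, default, prepay)]
    for hd, hp in zip(h_default, h_prepay):
        new_defaults = active * hd
        new_prepays = active * hp
        active -= new_defaults + new_prepays  # conservation of accounts
        default += new_defaults
        prepay += new_prepays
        path.append((active, default, prepay))
    return path
```

Running both baselines on any pair of hazard sequences, for instance a flat hypothetical default hazard of 0.002 and a flat hypothetical prepay hazard of 0.03 over 60 months, reproduces the qualitative pattern described for Figure 3: fewer cumulative prepays, more cumulative defaults and more accounts still active at month 60, with the three fractions summing to one at every age.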

Figure 4 Conditional Good/Bad sInf(x) score frequencies (DEV2012M4). (Curves: f(sInf(x)|B) and f(sInf(x)|G); horizontal axis: sInf(x), from −3 to 4.)

7. Conclusions

This paper has proposed a proportional odds model for the incorporation of age-dependent systemic risk and non-systemic risk where posterior odds is factorable and log odds risk scores can be decomposed into two additive parts. The overall risk score includes an age-dependent baseline prior and a stationary component for the behaviour of individual accounts relative to the baseline. Bayes’ formula affords us a natural way to update the prior belief in the baseline with the non-systemic risk

10


associated with each individual. Although we have proposed and often used a proportional odds model, only well-designed experiments and the passage of time will tell us how it compares with other default, response and prepayment risk prediction models. Even though there is no one method that can reliably predict systemic risk from disruptive natural or man-made causes, it is our thesis that one should start with a prior prediction of a baseline that reflects beliefs and expert knowledge about the likely behaviour of new economic conditions, new product offerings, market disruptions or other non-stationary systemic events. Once that model or belief has been established one can adjust this prior and coherently combine it with the effects of well-tested and well-calibrated non-systemic risk patterns. In most lending applications that we are familiar with, the general practice is to obtain a non-systemic risk score and then undertake a sensitivity analysis using different assumptions about likely future changes in the population odds. Thus, the sensitivity to systemic risk is assessed after the non-systemic risk is established. In our opinion this practice should be reversed; every effort should first be made to assess and discriminate disruptive from steady-state effects, to identify the structure of systemic risk and to then introduce the non-systemic components. There are many ways in which risks interact but we must always adhere to well-known conservation principles for baseline flows of defaults and prepayments from inventories of active accounts. In describing total default risk it is as helpful to be explicit about the mathematical structure of systemic risk associated with a baseline as it is about the non-systemic risk patterns of individuals.
We can use time-dependent hazard rate models but there is no intent to suggest that (19), (21) or (24) is the correct expression for the age- or time-dependent baseline score or that hazard rates must be represented by particular fitting functions. For example, the cumulative hazard function for the baseline in (9) may depend on time-series of economic data y ∈ Y whose structure has little or no connection with the data x ∈ X used in traditional scorecards. Seasonalities, expert judgments about future changes in macroeconomic variables, market disruptions, terms and pricing of new financial loans or new acquisition policies may play a role in constructing a sensible mathematical and statistical baseline. The hybrid loan example should make it clear that one cannot assess default and prepayment risk by using a traditional non-systemic default scorecard and a separate prepayment scorecard in isolation from one another. The number of defaults and prepayments in one period depends on the number of active accounts in a prior period; for this reason it is important to first describe the balance and flow equations that link systemic changes in default, prepayment and active accounts. In many credit risk portfolios there are two transient states: one an active, the other a delinquent state. Transitions from the latter may include return to the active state as well as default or prepayment. In such cases appropriate equations of motion for hazard rates, flows and baseline inventories need to be developed before non-systemic risk scores are incorporated in posterior risk assessments.

Although we have focused our attention on a model that combines systemic and non-systemic risk, we have not addressed many formidable statistical estimation issues that arise with censoring, unrevealed defaults and prepayments in mid-term scorecards as well as the need for informative reject inference. In our opinion it is important for both the conceptual models and the statistical estimation procedures to include prior beliefs about shape and timing of systemic risk components and then update these in light of new behavioural and financial patterns of the borrower as well as new pricing or loan products. We subscribe to the use of updating methods for systemic and non-systemic risk as we do to the use of prior judgments for systemic risk. Obviously, the information odds components contain valuable information on borrowers and features of new loan products that should be included; based on limited experience, we have come to believe that much more attention must be given to describing, assessing and quantifying systemic risk components in retail credit.

Acknowledgements

The author extends his thanks to both referees for their helpful critiques and comments and for their suggestions for improving the clarity of exposition.

References

Bellotti T and Crook JN (2009). Credit scoring with macroeconomic variables using survival analysis. Journal of the Operational Research Society 60(12): 1699–1707.
Breeden JL (2007). Modeling data with multiple time dimensions. Computational Statistics & Data Analysis 51(9): 4761–4785.
Breeden JL and Parker R (2009). Moving from rankings to ratings. Paper presented at Edinburgh Credit Scoring Conference XI.
Cox DR (1972). Regression models and life-tables. Journal of the Royal Statistical Society, Series B 34(2): 187–220.
Cox DR and Oakes D (1984). Analysis of Survival Data. Chapman and Hall: London.
Crook JN, Thomas LC and Hamilton R (2004). The degradation of the scorecard over the business cycle. In: Thomas LC, Edelman DB and Crook JN (eds). Readings in Credit Scoring, Chapter 11. Oxford University Press: Oxford, England, pp 161–175.
Efron B (2002). The two-way proportional hazards model. Journal of the Royal Statistical Society, Series B 64(4): 899–909.
Ferguson CH (2012). Predator Nation. Crown Business Publishing: New York.
Hand DJ and Kelly MG (2001). Lookahead scorecards for new fixed term credit products. Journal of the Operational Research Society 52(9): 989–996.
Heitfeld E (2010). Lessons from the crisis in mortgage-backed structured securities: Where did credit ratings go wrong. In: Bocker K (ed). Rethinking Risk Measurement and Reporting, Vol. II. Risk Books: London. ISBN 978-1-906348-50-2.
Im JK, Apley DW, Qi C and Shan X (2012). A time-dependent proportional hazards survival model for credit risk analysis. Journal of the Operational Research Society 63(3): 306–321.
Lo AW (2012). Reading about the financial crisis: A twenty-one-book review. Journal of Economic Literature 50(1): 151–178.
Narain B (1992). Survival analysis and the credit granting decision. In: Thomas LC, Crook JN and Edelman DB (eds). Credit Scoring and Credit Control. Clarendon Press: Oxford, pp 109–122.
Pavlidis NG, Tasoulis DK, Adams NM and Hand DJ (2012). Adaptive consumer credit classification. Journal of the Operational Research Society 63(12): 1645–1654.
Thomas L (2009). Consumer Credit Models. Oxford University Press: New York.
US Government (2011). The Financial Crisis Inquiry Report, Final Report of the National Commission on the Causes of the Financial and Economic Crisis in the United States, ISBN 978-0-16-0879830001, Superintendent of Documents, US Government Printing Office, Washington, DC 20402.
Weibull W (1951). A statistical distribution function of wide applicability. Journal of Applied Mechanics, Transactions of the ASME 18(3): 293–297.
Whittaker J, Whitehead C and Somers M (2007). A dynamic scorecard for monitoring baseline performance with application to tracking a mortgage portfolio. Journal of the Operational Research Society 58(7): 911–921.

Appendix A

Default risk with proportional hazard rates (The Cox Model)

The proportional hazards model of Cox (1972) was originally developed to describe the effects of biomedical responses. Variants of this model are used in credit risk scoring by Hand and Kelly (2001), Thomas (2009) and others. The hazard rate is expressed as the product of two factors, one being a time-dependent baseline, $h_0(t)$, for the group as a whole, the other a scaling factor that contains information about the individual independent of age or time. In lending applications the scaling factor appears in the hazard rate for each individual. To simplify the discussion let us assume we are only interested in default risk, that is, without prepayments. By assumption the conditional default hazard rate for the ith individual is

$$ h_B(t \mid x_i) = h_0(t)\,\xi(x_i), \qquad \xi \ge 0, \quad x_i \in X. \tag{A.1} $$

The x-dependent scaling factor describes the hazard rate of an individual relative to that of the time-dependent baseline. In the two-outcome case, the Cox model for the survival probability is therefore

$$ P_G(t \mid x_i) = e^{-H_B(t \mid x_i)} = e^{-H_0(t)\,\xi(x_i)} = \left(e^{-H_0(t)}\right)^{\xi(x_i)}, \tag{A.2} $$

where $H_B(t)$ corresponds to the time-dependent cumulative hazard function. This equation compares with the equation for survival probability in (13) where the multiplicative factor for age-dependence applies to the information odds, not the instantaneous hazard rate. When we consider two different individuals with the same t value, the ratio of the probabilities of remaining an active account contains an exponential term proportional to the difference in the scaling factors, independent of time or age. When there is no prepayment it follows from (A.2) that the Good/Bad log odds risk score is given by

$$ s(t \mid x_i) = \ln \frac{P_G(t \mid x_i)}{P_B(t \mid x_i)} = -\ln\left(e^{H_0(t)\,\xi(x_i)} - 1\right), \tag{A.3} $$

which is not factorable into systemic and non-systemic components. In statistical examination of credit risk data most authors assume that the data-dependent scaling factor in (A.2) is itself exponential, that is $\xi(x_i) = \exp\{d^T x_i\}$. The vector x represents risk characteristics with d a vector of coefficients (weights) estimated from the data. Thus the probability of the ith account being active at time t is a special case of (A.2) with the time-dependent score

$$ s(t \mid x_i) = -\ln\left(e^{H_0(t)\,\exp\{d^T x_i\}} - 1\right). \tag{A.4} $$

Neither the scaling factor in (A.1) nor the exponent $d^T x$ should be confused with the log odds score in (A.3). Although the meaning is inconsistent with our usage, this factor has been described as a hazard score by Thomas (2009). Efron (2002) generalizes (A.1) so that both calendar time (t) and age or time on books (τ) are incorporated in a two-way proportional hazards model:

$$ h_B(t, \tau \mid x_i) = h_0(t)\,g_0(\tau)\,\xi(x_i), \qquad \xi \ge 0, \quad x_i \in X. \tag{A.5} $$

Although he does not explicitly concern himself with credit risk scores, he obtains numerous results for discrete survival distributions in GLM models and discusses estimation under different structural assumptions for calendar time hazard $h_0(t)$ and age hazard $g_0(\tau)$.
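The statement that the Cox log odds score in (A.3) is not factorable can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the cumulative baseline hazard values and the two scaling factors are hypothetical, and the function names are not taken from the paper. It evaluates (A.3) for two individuals and shows that the difference between their scores changes as the cumulative baseline hazard grows, whereas an additive decomposition into a baseline score plus an individual score would keep that difference constant.

```python
import math

def cox_log_odds(H0, xi):
    """Good/Bad log odds under (A.3): s = -ln(exp(H0 * xi) - 1).
    H0 is the cumulative baseline hazard, xi the individual scaling factor."""
    return -math.log(math.expm1(H0 * xi))

def score_gap(H0, xi_a, xi_b):
    """Difference in Cox log odds scores between two individuals at the same age."""
    return cox_log_odds(H0, xi_a) - cox_log_odds(H0, xi_b)

# Hypothetical inputs: a low-risk (xi = 0.5) and a high-risk (xi = 2.0) account,
# compared at an early and a later value of the cumulative baseline hazard.
gap_early = score_gap(0.012, 0.5, 2.0)
gap_late = score_gap(0.12, 0.5, 2.0)
print(f"score gap early: {gap_early:.4f}, late: {gap_late:.4f}")
```

For small values of the cumulative hazard the gap is close to ln(ξ_b/ξ_a), so the two models behave similarly early in the life of a cohort; the gap then widens as the cumulative hazard increases.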

Appendix B

Default and prepay terminations in discrete time

In a discrete-time formulation that uses difference equations, the conditional default and prepayment hazard rates in each period (days, weeks, months) are denoted by

$$ h_D = (h_D(1), h_D(2), \ldots, h_D(n), \ldots, h_D(K)) \quad \text{and} \quad h_P = (h_P(1), h_P(2), \ldots, h_P(n), \ldots, h_P(K)) \tag{B.1} $$

with the termination hazard rate vector $h_T = h_D + h_P$. From the termination rates we can obtain the (survival) probability an account will remain active as a vector $P_A = (P_A(1), P_A(2), \ldots, P_A(n), \ldots, P_A(K))$ with elements of the vector a non-increasing sequence defined in terms of the discrete cumulative termination function:

$$ P_A(n) = \prod_{k=1}^{n} \bigl(1 - h_T(k)\bigr) = e^{-H_T(n)}, \qquad H_T(n) \triangleq -\sum_{k=1}^{n} \ln\bigl(1 - h_T(k)\bigr), \qquad n = 1, 2, \ldots, K. \tag{B.2} $$

The cumulative default and prepay probabilities are vectors: $P_D = (P_D(1), P_D(2), \ldots, P_D(n), \ldots, P_D(K))$ and $P_P = (P_P(1), P_P(2), \ldots, P_P(n), \ldots, P_P(K))$. Conservation requires that the cumulative fractions add to one or that net changes add to zero,

$$ P_D(n) + P_P(n) + P_A(n) = 1 \quad \text{or} \quad p_D(n) + p_P(n) + P_A(n) - P_A(n-1) = 0, \qquad n = 1, 2, \ldots, K. $$

With an assumed vector of prepay and default hazard rates one can recursively compute the cumulative fraction of defaults, prepays and active accounts so that the age-dependent systemic active/terminated (AT) and prepay/no prepay (PN) scores are given by:

$$ s^{AT}_{Pop}(n) = \ln \frac{P_A(n)}{P_D(n) + P_P(n)} = \ln \frac{e^{-H_T(n)}}{\sum_{k=1}^{n} h_T(k)\, e^{-H_T(k-1)}}, $$

$$ s^{PN}_{Pop}(n) = \ln \frac{P_P(n)}{P_A(n) + P_D(n)} = \ln \frac{\sum_{k=1}^{n} h_P(k)\, e^{-H_T(k-1)}}{e^{-H_T(n)} + \sum_{k=1}^{n} h_D(k)\, e^{-H_T(k-1)}}. \tag{B.3} $$

It is a straightforward calculation to obtain the conditional age-dependent total score for each individual and, by inversion, the desired probabilities.
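The recursion described in this appendix can be sketched in a few lines. This is a minimal illustration rather than the author's code: it assumes every account is active at age zero, so that the active fraction before the first period is 1, and it uses hypothetical constant monthly hazards (0.002 for default, 0.01 for prepayment) over 60 months purely as example inputs. The function and variable names are likewise hypothetical.

```python
import math

def baseline_paths(h_default, h_prepay):
    """Cumulative default, prepay and active fractions from per-period hazards.
    Each period's flow is the hazard times the active fraction at the end of
    the previous period, which enforces the conservation condition."""
    P_D, P_P, P_A = [], [], []
    active, cum_default, cum_prepay = 1.0, 0.0, 0.0  # assumed initial state
    for h_d, h_p in zip(h_default, h_prepay):
        cum_default += h_d * active   # p_D(n) = h_D(n) * P_A(n-1)
        cum_prepay += h_p * active    # p_P(n) = h_P(n) * P_A(n-1)
        active *= 1.0 - (h_d + h_p)   # P_A(n) = P_A(n-1) * (1 - h_T(n))
        P_D.append(cum_default)
        P_P.append(cum_prepay)
        P_A.append(active)
    return P_D, P_P, P_A

def systemic_scores(P_D, P_P, P_A):
    """Active/terminated and prepay/no-prepay log odds scores, as in (B.3)."""
    s_AT = [math.log(a / (d + p)) for d, p, a in zip(P_D, P_P, P_A)]
    s_PN = [math.log(p / (a + d)) for d, p, a in zip(P_D, P_P, P_A)]
    return s_AT, s_PN

# Hypothetical example: constant monthly hazards over 60 months.
h_D = [0.002] * 60
h_P = [0.01] * 60
P_D, P_P, P_A = baseline_paths(h_D, h_P)
s_AT, s_PN = systemic_scores(P_D, P_P, P_A)
print(f"month 60: P_D={P_D[-1]:.4f} P_P={P_P[-1]:.4f} P_A={P_A[-1]:.4f}")
```

Because the three fractions are updated from the same active inventory, they sum to one in every period, which is the conservation condition stated above; age-dependent hazards can be substituted for the constant vectors without changing the recursion.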

Received 14 August 2012; accepted 04 September 2013 after one revision