Stable Modeling of Operational Risk

Anna Chernobai∗
Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106, USA

Svetlozar Rachev†
Department of Econometrics and Statistics, University of Karlsruhe, D-76128 Karlsruhe, Germany
Department of Statistics and Applied Probability, University of California, Santa Barbara, CA 93106, USA

1 Introduction

In the course of the last few decades, the financial industry has been characterized by accelerated globalization, deregulation, and a surge of new products, instruments, and services. An inevitable outcome has been an elevated exposure of financial institutions to various sources of risk. A large number of losses borne by financial institutions are attributable to neither market nor credit risk. In fact, it is the large-magnitude losses that occur rarely (i.e., low frequency, high severity losses) that pose the most serious danger: recall the dramatic consequences of unauthorized trading, fraud, and human error at Orange County (USA, 1994), Barings (Singapore, 1995), and Daiwa (Japan, 1995), among many others¹. Financial institutions face three types of risk: (1) market risk, the risk of movements in market indicators such as the S&P 500, exchange rates, and interest rates; (2) credit risk, i.e. counterparty risk; and (3) 'other' risks. Operational risk, until recently perceived as a mere part of the third category, has been recognized as a major financial risk component. The Basel II Capital Accord of 2001 defined operational risk as "the risk of loss resulting from inadequate or failed internal processes, people and systems or from external events" [3], [4]. Banks allocate roughly 60% of their regulatory capital to credit risk, 15% to market risk, and 25% to operational risk² [12]. Under the 2001 Basel II Capital Accord [3], banks are subject to a capital charge for unexpected operational losses.³ The BIS suggested the following five methodologies: the Basic Indicator Approach, the Standardized Approach, the Internal Measurement Approach, the Scorecard Approach, and the Loss Distribution Approach; banks are required to adopt one by the end of 2006. The Loss Distribution Approach (henceforth, LDA) is the most accurate from the statistical point of view, as it utilizes the exact distribution of the losses, both frequency and severity, based on each individual bank's internal loss data. Under the LDA, the operational risk capital charge is determined by the Value-at-Risk measure (henceforth, VaR)⁴, summed across all 'business line / event type' combinations. The capital charge is hence dependent on the nature of the frequency and severity distributions. In particular, under the VaR measure, the tails of these distributions become the key determinants of the amount of the capital charge.

∗ Electronic mail: [email protected].
† Electronic mail: [email protected]. S. Rachev's research was supported by grants from the Division of Mathematical, Life and Physical Sciences, College of Letters and Science, University of California, Santa Barbara, and the Deutsche Forschungsgemeinschaft.
¹ Additionally, risks of natural disasters, significant computer failures, and terrorist attacks are other examples of non-market, non-credit risk related events (for example, the financial consequences of the September 11, 2001 attack).
² Crouhy [6] states 70%, 10% and 20% figures, respectively, for credit, market and operational risk capital allocation.
³ Expected operational losses require capital provisions.
⁴ The concept of Value-at-Risk is reviewed in Section 6.

BIS suggests using the Lognormal distribution for the loss severity and the Poisson distribution for the loss frequency. Under this assumption, the inter-arrival times follow an Exponential distribution, which has light (exponentially decaying) tails, and the Lognormal distribution for the loss amounts implies that the probability of very high magnitude losses is negligible (its tails are relatively light). We strongly believe that with the compound Poisson model the true capital charge would be seriously underestimated. Losses that are very rare but of high magnitude can potentially cause severe damage to the institution (see the examples above and footnote 1), and thus should not be neglected; thin-tailed distributions assign a practically zero probability to such rare events. In general, the thickness of the tails is determined by the asymptotic tail probabilities. A few examples of tail behavior (thinner tails first, with a short numerical comparison after the list) include [8]:

Normal-type tails (Normal):
    \bar{F}(x) \sim e^{-x^2}

Exponential-type tails (Exponential, Gamma):
    \bar{F}(x) \sim e^{-x}

Lognormal-type tails (Lognormal, Benktander-type-I):
    \bar{F}(x) \sim e^{-\ln^2 x}

Weibull-type tails (heavy-tailed Weibull, Benktander-type-II):
    \bar{F}(x) \sim e^{-x^{\gamma}}, \quad \gamma \in (0, 1)

Power-type tails (Pareto, LogGamma, Burr, α-Stable):
    \bar{F}(x) \sim x^{-\alpha} \quad (\alpha \in (0, 2) \text{ for } \alpha\text{-Stable})
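To make these differences concrete, the following sketch evaluates the survival function P(X > x) far in the tail for representatives of several of the classes above. The parameter values are illustrative only (not taken from the paper); the point is the orders of magnitude separating exponential-type from power-type tails.

```python
import numpy as np
from scipy import stats

# Illustrative parameters, chosen only so that all laws have unit scale.
x = 30.0  # a point far in the right tail

tails = {
    "Normal(0,1)":       stats.norm.sf(x),
    "Exponential(1)":    stats.expon.sf(x),
    "Lognormal(0,1)":    stats.lognorm.sf(x, s=1.0),
    "Pareto(alpha=1.5)": stats.pareto.sf(x, b=1.5),
}
for name, p in tails.items():
    print(f"{name:>18}: P(X > {x:.0f}) = {p:.3e}")
# The power-type (Pareto) tail dominates the others by many orders of
# magnitude, which is why thin-tailed models understate rare large losses.
```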

In general, for various operational risk event types, business lines, and banks, loss magnitudes and frequencies can be light-tailed or heavy-tailed. One can identify four broad categories of stochastic processes for operational risk (in this paper we study the last case):

1. Light-tailed inter-arrival times and light-tailed loss amounts;
2. Light-tailed inter-arrival times and heavy-tailed loss amounts;
3. Heavy-tailed inter-arrival times and light-tailed loss amounts;
4. Heavy-tailed inter-arrival times and heavy-tailed loss amounts.

In this paper, we provide empirical evidence that supports using heavy-tailed distributions. In particular, the α-Stable distributions (also called Stable Paretian distributions) have been shown to provide an excellent fit for both inter-arrival times and loss magnitudes. α-Stable distributions possess several attractive features that can be useful for practitioners in the risk management area. One feature is the existence of a domain of attraction (the Stable-case 'counterpart' of the Central Limit Theorem for a sum of random variables with finite second moment): a sum of appropriately standardized α-Stable random variables is again, in distribution, an α-Stable random variable with the same stability index (hence the name 'Stable'). Another key feature is the power-like tail decay: it allows one to capture the data that lie in the tails of the distribution. Hence, α-Stable distributions assign a non-zero probability to high-magnitude losses. Stable distributions also account for skewness, which is impossible with, for example, the Lognormal distribution (indeed, the mean value of losses exceeds the median in the BIS data collection exercise results [5], suggesting right-skewness). One main barrier that keeps α-Stable distributions from being extensively used in practice is the fact that closed-form densities do not exist (with three exceptions). Hence, Maximum Likelihood estimation (MLE) of the parameters becomes complicated. This difficulty can be overcome: this paper's estimations are based on a numerical approximation of the density via the fast Fourier transform (FFT) of the characteristic function, which allows one to perform the MLE, and the goodness-of-fit results are comparable with those of other distributions (see [18], pp. 120-125).
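A minimal sketch of such a density approximation is given below, assuming the standard S1 parameterization with α ≠ 1 (the characteristic function follows the usual textbook form, e.g. [18]; the function names and all numerical settings are ours, not the authors'). The resulting density can be plugged into a numerical optimizer to carry out the MLE.

```python
import numpy as np

def stable_pdf_fft(x, alpha, beta=0.0, scale=1.0, loc=0.0, N=2**14, h=0.01):
    """Approximate the alpha-Stable density by FFT inversion of its
    characteristic function (S1 parameterization, alpha != 1).
    N is the grid size and h the spatial spacing; both trade accuracy
    against speed and are illustrative choices."""
    k = np.arange(N)
    xg = (k - N / 2) * h                    # spatial grid, centered at 0
    t = 2 * np.pi * (k - N / 2) / (N * h)   # matching frequency grid
    # characteristic function E exp(itX) of the alpha-Stable law
    phi = np.exp(1j * loc * t
                 - np.abs(scale * t) ** alpha
                 * (1 - 1j * beta * np.sign(t) * np.tan(np.pi * alpha / 2)))
    # discrete inverse Fourier transform; the (-1)^k factors re-center the
    # grids so that np.fft.fft matches the continuous Fourier integral
    sgn = (-1.0) ** k
    f = np.real(sgn * np.fft.fft(sgn * phi)) / (N * h)
    return np.interp(x, xg, np.maximum(f, 0.0))

def stable_negloglik(params, data):
    """Negative log-likelihood built on the FFT density, suitable for
    minimization with, e.g., scipy.optimize.minimize."""
    alpha, beta, scale, loc = params
    pdf = stable_pdf_fft(data, alpha, beta, scale, loc)
    return -np.sum(np.log(pdf + 1e-300))  # guard against log(0)
```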

This paper is organized as follows. In Section 2 we give an overview of sub-Gaussian α-Stable distributions and discuss their essential properties that are of interest in practice. In Section 3 we briefly look at the differences between the models used for market and credit risks, and determine the nature of the models that ought to be used for operational risk modeling. In the subsequent section, Section 4, we discuss four different compound models that can be used for the purpose of operational risk analysis. The first model is the compound Poisson process, which was proposed by the Basel Committee. We argue that such a model is overly restrictive: it requires a constant intensity of the loss events, and it implies a thin-tailed (Exponential) distribution of the time intervals between the events. We discuss the alternatives: the compound Cox process and compound mixed Poisson processes. Some properties of a special case, a process in which both inter-arrival times and loss amounts are heavy-tailed, are reviewed in Section 4.4. Section 5 provides empirical evidence from a European bank's operational loss data in favor of using heavy-tailed α-Stable distributions for both the frequency and severity distributions of operational risk models. In contrast, the thin-tailed distributions show a fairly poor fit. This has a serious implication for the estimation of the capital charge, as discussed in Section 6. In the same section, we also argue in favor of an alternative to the VaR methodology for the estimation of the capital charge, namely the Expected Tail Loss (ETL); using the ETL is in line with our discussion of the α-Stable distributions. Finally, Section 7 gives concluding remarks and outlines ideas for future research.

2 α-Stable Distributions

This section gives a brief overview of α-Stable random variables and some of their essential properties.

Definition 1 (Stable random variable). A random variable X has a Stable distribution if, for any n ≥ 2, there exist numbers C_n > 0 and D_n ∈ ℝ such that

X_1 + X_2 + \cdots + X_n \overset{d}{=} C_n X + D_n,

where X_1, X_2, ..., X_n are independent copies of X. Equivalently [22], X ∼ S_α(σ, β, µ) can be defined through its characteristic function

E\, e^{i\theta X} = \begin{cases} \exp\left\{ -\sigma^{\alpha}|\theta|^{\alpha}\left(1 - i\beta\,\mathrm{sign}(\theta)\tan\tfrac{\pi\alpha}{2}\right) + i\mu\theta \right\}, & \alpha \ne 1 \\ \exp\left\{ -\sigma|\theta|\left(1 + i\beta\tfrac{2}{\pi}\,\mathrm{sign}(\theta)\ln|\theta|\right) + i\mu\theta \right\}, & \alpha = 1 \end{cases}

where sign(θ) equals 1 for θ > 0, 0 for θ = 0, and −1 for θ < 0.

For a compound Poisson process S_λ, the normalized process is asymptotically Normal,

P\left( \frac{S_\lambda - a(\lambda)}{b(\lambda)} \le x \right) \to \Phi(x), \quad \lambda \to \infty,    (12)

iff E(X_1^2) exists. The functions a(λ) and b(λ) are the mean and the standard deviation of S_N, respectively, and Φ(x) denotes the cumulative distribution function of a standard Normal variable. A compound Poisson process with a Lognormal generating process therefore appears overly restrictive for operational risk modeling. In the following sections, we study three broad groups of stochastic models for operational risk (the first two are generalizations of the compound Poisson process, and the third is a special case):⁶

⁶ See, for example, [23], [13]. Leipnik [13] showed the characteristic function of the Lognormal distribution to be of the form

\varphi_X(t) = \sqrt{\frac{\pi}{2\sigma^2}} \sum_{\ell=0}^{\infty} (-1)^{\ell}\, d_{\ell}\, (2\sigma^2)^{-\ell/2}\, H_{\ell}\!\left( \frac{\log t + \mu + \frac{\ell\pi i}{2}}{\sigma\sqrt{2}} \right) e^{-\frac{\left(\log t + \mu + \frac{\ell\pi i}{2}\right)^2}{2\sigma^2}},

where d_0 = 1, d_{n+1} = \frac{\gamma d_n + \sum_{k=1}^{n} (-1)^k \zeta(k+1)\, d_{n-k}}{n+1}, n ≥ 0, in which γ is Euler's constant, ζ(r) = \sum_{n=1}^{\infty} n^{-r} is the zeta function, and H_\ell are the Hermite polynomials.

I. Compound mixed Poisson process. The intensity of the occurrences of loss events is random but independent of time. The increments of the process are stationary, so the compound process is homogeneous.

II. Compound Cox process. The intensity of the occurrences of loss events is random, and the time parameter t is incorporated into the intensity. We refer to the asymptotic properties of such a process as time-asymptotic, as they take place when t → ∞. The increments of the process are non-stationary, so the compound process is inhomogeneous.⁷

III. Compound models with Stable inter-arrival times and Stable loss distributions. In this process, studied in [14], [15], [16], the inter-arrival times and loss magnitudes both follow a heavy-tailed distribution. The limiting cases are both N-asymptotic and time-asymptotic, and the limiting properties vary depending on which of N or t approaches infinity first. The limiting process is a self-similar process.

4.2 Compound Mixed Poisson Processes

The intensity of the arrival of loss events may not be constant: it can be random (but independent of time).

Definition 3 (Mixed Poisson process) [2]. Let the random variable Λ and the standard Poisson process N_1 be independent. The point process N = N_1 ∘ Λ, where N_1 ∘ Λ(t) = N_1(Λt), is called a mixed Poisson process. The corresponding compound process is the compound mixed Poisson process.

The probability mass function of the counting process now depends on the values of the stochastic intensity:

P(N(t) = k) = \int_0^{\infty} e^{-t\lambda} \frac{(t\lambda)^k}{k!} f_{\Lambda}(\lambda)\, d\lambda = E\left[ e^{-t\Lambda} \frac{(t\Lambda)^k}{k!} \right]    (13)

The cumulative distribution function of the compound loss amount is calculated as:

P(S_N \le x) = \sum_{k=0}^{\infty} P(N(t) = k)\, P\Big( \sum_{j=1}^{k} X_j \le x \Big)    (14)

Inter-arrival times. The inter-arrival times follow an Exponential distribution with stochastic parameter tΛ. The density, thus, is:

f_Y(y) = \int_0^{\infty} t\lambda\, e^{-yt\lambda} f_{\Lambda}(\lambda)\, d\lambda = E\left[ t\Lambda\, e^{-yt\Lambda} \right]    (15)

Characteristic function. The characteristic function of the cumulative loss process is:

\varphi_{S_{N(t)}}(u) = G_{N(t)}\!\left( \int_0^{\infty} e^{iux} f_X(x)\, dx \right) = \sum_{k=0}^{\infty} \left( \int_0^{\infty} e^{iux} f_X(x)\, dx \right)^{k} \int_0^{\infty} e^{-t\lambda} \frac{(t\lambda)^k}{k!} f_{\Lambda}(\lambda)\, d\lambda    (16)

⁷ In the literature, both Cox processes and mixed Poisson processes are often called Cox processes. Both being doubly-stochastic Poisson processes, they nevertheless possess quite different properties in terms of dependence on the time variable. The Cox processes (as defined in this paper) are inhomogeneous because their increments are non-stationary; the mixed Poisson processes are homogeneous because their increments are stationary. See, for example, Remark 5.1 in [11], and a brief discussion of the differences in [2], §6.2.

Mean and variance. The mean is linearly proportional to t, as in the fixed-intensity case. The variance of the counting process is proportional to t². Both are easily derived to be:

E N(t) = t\, E\Lambda    (17)
\mathrm{var}(N(t)) = t^2\, \mathrm{var}\,\Lambda + t\, E\Lambda    (18)

The mean and the variance of the compound mixed Poisson process are proportional to t and t², respectively, and satisfy:

E S_{N(t)} = E N(t)\, E X = t\, E\Lambda\, \mu_X    (19)
\mathrm{var}(S_{N(t)}) = \mathrm{var}(N(t)) (E X)^2 + E N(t)\, \mathrm{var}\,X = \mu_X^2\, t \left( t\, \mathrm{var}\,\Lambda + E\Lambda \right) + \sigma_X^2\, t\, E\Lambda    (20)
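As a sanity check on (19)-(20), the sketch below simulates a compound mixed Poisson process with a Gamma-distributed intensity and Lognormal losses and compares the sample mean and variance of the aggregate with the formulas. The Gamma/Lognormal choice and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
t, n_sims = 5.0, 100_000
a, b = 2.0, 1.5          # illustrative Gamma(shape, scale) mixing intensity
mu, sigma = 0.5, 0.8     # illustrative Lognormal severity parameters

lam = rng.gamma(a, b, size=n_sims)        # random intensity Lambda
counts = rng.poisson(lam * t)             # N(t) | Lambda ~ Poisson(Lambda * t)
S = np.array([rng.lognormal(mu, sigma, n).sum() for n in counts])

EX = np.exp(mu + sigma**2 / 2)            # Lognormal mean
varX = (np.exp(sigma**2) - 1) * np.exp(2 * mu + sigma**2)
EL, varL = a * b, a * b**2                # moments of the Gamma intensity

print("mean:", S.mean(), "vs", t * EL * EX)                  # eq. (19)
print("var :", S.var(), "vs",
      EX**2 * t * (t * varL + EL) + varX * t * EL)           # eq. (20)
```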

It is clear that the first moment exists only if the first moments of the loss magnitude and the intensity processes are finite; similarly, the second moment exists only if the second moments of the loss magnitude and the intensity processes are finite. It is worth emphasizing again that, unlike the compound Poisson process, in which both the mean and the variance are linearly proportional to time, in the mixed Poisson process these are no longer linear in t.

Example 2. Suppose Λ ∼ N(ν, s²). If the loss amounts follow a Lognormal distribution with parameters µ and σ, then the mean and the variance of the cumulative loss process become:

E S_{N(t)} = t\nu\, e^{\mu + \frac{\sigma^2}{2}}
\mathrm{var}(S_{N(t)}) = t^2 s^2 e^{2\mu + \sigma^2} + t\nu\, e^{2(\mu + \sigma^2)}

Compared with the example in Section 4.1, this suggests a higher variance.

Next, we reproduce some important tail properties. We begin with a definition.

Definition 4 (Slowly varying function). A function L is slowly varying at infinity, denoted L ∈ SV, if L(yx) ∼ L(x) as x → ∞, i.e. if

\lim_{x \to \infty} \frac{L(yx)}{L(x)} = 1 \quad \text{for all } y > 0.

Definition 5 (Regularly varying function). A function F is regularly varying at infinity with exponent γ, denoted F ∈ RV_γ, if F(x) = x^γ L(x) for some slowly varying function L.

Tail properties [11], [24]. Suppose that F_Λ(λ) is absolutely continuous with density f_Λ(λ) such that

f_\Lambda(\lambda) \sim L(\lambda)\, \lambda^{\gamma - 1} e^{-\beta\lambda}, \quad \lambda \to \infty,    (21)

for β ≥ 0 and −∞ < γ < ∞, with γ < 0 if β = 0, implying a tail of the form

\bar{F}_\Lambda(\lambda) \sim \begin{cases} \dfrac{L(\lambda)\lambda^{\gamma}}{-\gamma} & \text{if } \beta = 0 \\[1ex] \dfrac{L(\lambda)\lambda^{\gamma - 1} e^{-\beta\lambda}}{\beta} & \text{if } \beta > 0 \end{cases}, \quad \lambda \to \infty.    (22)

Then the tail behavior of the counting process is

P(N(t) > n) \sim \frac{L(n)\, t}{(\beta + t)^{\gamma}} \left( \frac{t}{\beta + t} \right)^{n} n^{\gamma - 1}, \quad n \to \infty.    (23)

If β = 0 and γ < 0 in (21), this is the heavy-tailed (power-like tail decay) case. If γ = 1, this is the thin-tailed (exponential tail decay) case. If the severity distribution is thin-tailed and non-lattice, then the tail behavior of the compound mixed Poisson process is:

\bar{F}_{S(t)}(x) \sim \frac{L(x)}{\left( h_X'(k)\, t \right)^{\gamma}}\, \frac{x^{\gamma - 1} e^{-kx}}{k}, \quad x \to \infty,    (24)

where h_X(k) is defined as h_X(k) = \int_{0^-}^{\infty} e^{kx}\, dF(x) and k > 0 satisfies h_X(k) − 1 = β/t. If the severity distribution is heavy-tailed and the inter-arrival times distribution is thin-tailed, then the compound loss process is heavy-tailed. If both the severity and frequency distributions are heavy-tailed, i.e.

\bar{F}_\Lambda(\lambda) \sim \omega L(\lambda) \lambda^{\gamma}, \quad \omega \ge 0, \ \lambda \to \infty,
\bar{F}_X(x) \sim \chi L(x) x^{\gamma}, \quad \chi \ge 0, \ x \to \infty,    (25)

for some γ < −1, then

\bar{F}_{S(t)}(x) \sim \left( \omega (\mu_X t)^{-\gamma} + \chi \mu_\Lambda t \right) L(x)\, x^{\gamma}, \quad x \to \infty;    (26)

in other words, the compound loss process is also heavy-tailed. These tail properties state that the loss severity and loss frequency distributions contribute fairly equally to the tail behavior of the aggregated loss process: if at least one of them is heavy-tailed, then so is the compound process. Compound mixed Poisson processes have been used in risk models and insurance mathematics to model claim amounts; references include [2], [8], [11].

4.3 Compound Cox Processes

This time the intensity factor is stochastic and depends on time.

Definition 6 (Random measure) [2]⁸. A random process {Λ(t), t ≥ 0} with non-decreasing, right-continuous sample paths, satisfying Λ(0) = 0 and P(Λ(t) < ∞) = 1 for 0 < t < ∞, is called a random measure.

For the purpose of Cox processes, Λ(t) is also required to be real-valued and defined on the non-negative half-line. In applied problems, it is often, for convenience, expressed in the form

\Lambda(t) = \int_0^{t} \lambda(\tau)\, d\tau, \quad t > 0,

where λ(τ) is a non-negative function, sometimes called the instantaneous intensity of the process N(t); Λ(t) is referred to as the cumulative intensity.

Let N(t) be the counting process for the loss events following a Poisson distribution, and let Λ(t) be a random measure representing the controlling process of the arrival of loss events. Then the counting process can be expressed as N(t) = N_1 ∘ Λ(t) = N_1(Λ(t)), where N_1(t) is the standard Poisson process with unit intensity.

Definition 7 (Inhomogeneous Poisson process). A point process {N(t), t ≥ 0} is called an inhomogeneous Poisson process with intensity measure {Λ(t), t ≥ 0} if the following holds: (i) N(t) is a process with independent increments; (ii) N(t) is inhomogeneous: for any s, t ≥ 0, the increments N(t) − N(s) have a Poisson distribution with parameter Λ(t) − Λ(s).

⁸ A precise definition of a random measure can be found in [20]. See also [10].

Definition 8 (Cox process). The random process N(t) = N_1(Λ(t)), in which Λ(t) is independent of N_1(t), is called a Cox process (or a doubly stochastic Poisson process). The Cox process N(t) is said to be controlled by the process Λ(t), or the process Λ(t) is said to control the Cox process N(t).

The probability mass function of the counting process now depends on the values of the stochastic intensity:

P(N(t) = k) = \int_0^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} f_{\Lambda(t)}(\lambda)\, d\lambda = E\left[ e^{-\Lambda(t)} \frac{(\Lambda(t))^k}{k!} \right]    (27)

The cumulative distribution function of the compound loss amount is calculated as:

P(S_N \le x) = \sum_{k=0}^{\infty} P(N(t) = k)\, P\Big( \sum_{j=1}^{k} X_j \le x \Big)    (28)

Inter-arrival times. The inter-arrival times follow an Exponential distribution with stochastic parameter Λ(t). The density, thus, is:

f_Y(y) = \int_0^{\infty} \lambda\, e^{-y\lambda} f_{\Lambda(t)}(\lambda)\, d\lambda = E\left[ \Lambda(t)\, e^{-y\Lambda(t)} \right]    (29)

Characteristic function. The characteristic function of the Cox counting process with stochastic intensity parameter is:

\varphi_{N(t)}(u) = \int_0^{\infty} e^{\lambda (e^{iu} - 1)} f_{\Lambda(t)}(\lambda)\, d\lambda = E\, e^{\Lambda(t)(e^{iu} - 1)},    (30)

which is the moment generating function of the controlling process Λ(t) evaluated at (e^{iu} − 1). The characteristic function of the cumulative loss process is:

\varphi_{S_{N(t)}}(u) = G_{N(t)}\!\left( \int_0^{\infty} e^{iux} f_X(x)\, dx \right) = \sum_{k=0}^{\infty} \left( \int_0^{\infty} e^{iux} f_X(x)\, dx \right)^{k} \int_0^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} f_{\Lambda(t)}(\lambda)\, d\lambda,    (31)

where G_{N(t)}(s) is the probability generating function of N(t) at point s.

Mean and variance. The mean and the variance of the Cox counting process depend on t through the controlling process:

E N(t) = E\Lambda(t)    (32)
\mathrm{var}(N(t)) = \mathrm{var}\,\Lambda(t) + E\Lambda(t)    (33)

The mean and the variance of the compound Cox process then satisfy:

E S_{N(t)} = E N(t)\, E X = E\Lambda(t)\, \mu_X    (34)
\mathrm{var}(S_{N(t)}) = \mathrm{var}(N(t)) (E X)^2 + E N(t)\, \mathrm{var}\,X = \mu_X^2 \left( \mathrm{var}\,\Lambda(t) + E\Lambda(t) \right) + \sigma_X^2\, E\Lambda(t)    (35)

It is clear that the first moment exists only if the first moments of the loss magnitude and the intensity processes are finite; similarly, the second moment exists only if the second moments of the loss magnitude and the intensity processes are finite.
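A compound Cox process is straightforward to simulate through the time-change representation N(t) = N_1(Λ(t)): draw a realization of the cumulative intensity, then a Poisson count with that parameter. The sketch below uses a deterministic seasonal rate modulated by a random level; the intensity form and all parameters are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def compound_cox_total(t, n_sims, mu=0.5, sigma=0.8):
    """Simulate S_{N(t)} for a Cox process via N(t) = N1(Lambda(t)).
    Lambda(t) = A * integral_0^t (1 + 0.5 sin(2 pi tau)) dtau with a
    random level A ~ Gamma(2, 1): an illustrative controlling process."""
    A = rng.gamma(2.0, 1.0, size=n_sims)
    # closed-form cumulative intensity of the seasonal rate
    Lam = A * (t + 0.5 * (1 - np.cos(2 * np.pi * t)) / (2 * np.pi))
    counts = rng.poisson(Lam)                    # N(t) | Lambda(t)
    return np.array([rng.lognormal(mu, sigma, n).sum() for n in counts])

S = compound_cox_total(t=3.0, n_sims=50_000)
print(S.mean(), S.var())  # compare with (34)-(35) via ELambda, varLambda
```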

Example 3. Suppose Λ(t) ∼ N(ν, s²), in which the parameters depend on t. If the loss amounts follow a Lognormal distribution with parameters µ and σ, then the mean and the variance of the cumulative loss process become:

E S_{N(t)} = \nu\, e^{\mu + \frac{\sigma^2}{2}}
\mathrm{var}(S_{N(t)}) = s^2 e^{2\mu + \sigma^2} + \nu\, e^{2(\mu + \sigma^2)}

The asymptotic properties here are related to time, so we distinguish them from the others by calling them time-asymptotic: they coincide with the asymptotic properties for fixed N in a fixed interval when the time t goes to infinity, i.e. when a longer time interval accounts for an increased intensity of events.

Asymptotic properties [2]. Λ(t) →^P ∞ as t → ∞. Hence, the asymptotic behavior of S_{N(t)} as Λ(t) → ∞ coincides with the asymptotic behavior of S_{N(t)} as t → ∞. Let Λ(t) →^P ∞ as t → ∞, and µ_X ≠ 0. Let D(t) > 0 be an infinitely increasing function. Then the normalized Cox process S(t) is asymptotically Normal with some asymptotic variance δ²:

P\left( \frac{S(t) - C(t)}{D(t)} \le x \right) \to \Phi(x/\delta), \quad t \to \infty    (36)

iff

\lim_{t \to +\infty} \frac{|C(t)|}{D^2(t)} \le \frac{|\mu_X|\, \delta^2}{\mu_X^2 + \sigma_X^2},
\lim_{t \to +\infty} L_1\left( P\left( \frac{\mu_X \Lambda(t) - C(t)}{D(t)} \le x \right),\ \Phi\left( \frac{|\mu_X| D(t)\, x}{\sqrt{|\mu_X| \delta^2 D^2(t) - (\mu_X^2 + \sigma_X^2) |C(t)|}} \right) \right) = 0,

where L_1 is the Lévy distance between two distribution functions, used to denote weak convergence. This property says that the standardized compound Cox process is asymptotically Normal if and only if its controlling process is. Compound Cox processes have been used in science to model inhomogeneous chaotic flows of events. Examples from insurance and finance include applying Cox processes with zero mean to model the evolution of stock prices, and with non-zero mean to model non-life insurance amounts (see, for example, [2], [10]).

4.4 Compound Models with Stable Inter-Arrival Times and Stable Loss Distributions

This section considers a renewal process in which both the inter-arrival times and the loss amounts follow a Stable distribution. Such a process can be thought of as a special case of the compound Cox process (Section 4.3) or the compound mixed Poisson process (Section 4.2). We follow the notation and results of [15], [14]. Let {U_k}_{k≥1} be an i.i.d. sequence of inter-arrival times, positive and integer-valued, with distribution in the Stable domain of attraction with stability index 1 < α < 2 (ensuring the existence of a finite mean). Let {W_k}_{k≥1} form a real-valued i.i.d. sequence of loss amounts, independent of {U_k}_{k≥1}, belonging to the Stable domain of attraction with index 0 < β < 2. The 'renewal-reward' process W(t) is then constructed by associating a constant 'reward' (in our case, loss) amount with every inter-renewal interval. Suppose we have M i.i.d. copies of W(t) (for example, M operational risk factors); then the cumulative loss process (up to time [Ty]) can be defined by

W^{*}(Ty, M) = \sum_{t=1}^{[Ty]} \sum_{m=1}^{M} W_m(t),    (37)

where 0 ≤ y ≤ 1, t = 0, 1, ..., T, and m = 1, 2, ..., M. Lévy and Taqqu show that the limiting behavior of such a compound process depends on which of M or T approaches infinity first. In either case, the limit is a self-similar process.

Definition 9. A process X(t), −∞ < t < ∞, is self-similar with parameter H (H-SS) if

X(at) \overset{d}{=} a^{H} X(t), \quad \forall\, a > 0.    (38)

One can distinguish between three cases of asymptotic behavior (a simulation sketch follows the list):

I. α < β, M → ∞ and then T → ∞. The loss amounts are more heavy-tailed than the inter-arrival times. The limit process is β-Stable and self-similar with H = (β − α + 1)/β. The process has stationary dependent increments.⁹

II. β ≤ α, M → ∞ and then T → ∞. The inter-arrival times are at least as heavy-tailed as the loss amounts. The limit process is β-Stable Lévy motion. It is a self-similar process with H = 1/β. The process has stationary independent increments.

III. β ≤ α or β > α, T → ∞ and then M → ∞. The limit process is Stable Lévy motion, a Stable process with tail index ρ = min{α, β}. It is a self-similar process with H = 1/ρ. It has stationary independent increments.

In all cases, the limiting process is self-similar. Non-Gaussian self-similar processes are said to exhibit long-range dependence. If the loss magnitudes are more heavy-tailed than the inter-arrival times, then the limiting self-similar process has dependent increments.
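To give a feel for these heavy-tailed limits, the sketch below simulates a renewal-reward aggregate with Pareto inter-arrival times and Pareto losses, both of which lie in Stable domains of attraction, and applies a Hill estimator to the per-period aggregate losses; the estimate can be compared with min{α, β}. All distributional choices and parameter values are illustrative assumptions, not taken from [14], [15].

```python
import numpy as np

rng = np.random.default_rng(2)

def pareto(tail_index, size):
    """Pareto samples with P(X > x) = x**(-tail_index) for x >= 1."""
    return rng.uniform(size=size) ** (-1.0 / tail_index)

def hill(data, k=200):
    """Crude Hill estimator of the tail index from the k largest points."""
    x = np.sort(data)[-k:]
    return 1.0 / np.mean(np.log(x / x[0]))

alpha, beta = 1.5, 1.2   # inter-arrival and loss tail indices (illustrative)
T, M = 1000, 20

# per-period aggregate loss over M independent risk factors: each factor's
# event count per period is driven by heavy-tailed inter-arrival times
totals = np.zeros(T)
for m in range(M):
    arrivals = np.cumsum(pareto(alpha, size=5 * T))
    counts, _ = np.histogram(arrivals, bins=np.arange(T + 1))
    totals += np.array([pareto(beta, n).sum() for n in counts])

print("Hill tail-index estimate of aggregate:", hill(totals))
print("min(alpha, beta) =", min(alpha, beta))
```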

5 Empirical Analysis of Operational Loss Data

5.1 Introduction

We obtained the operational loss data from a European bank¹⁰ over the period 1950-2002. The loss severity data are measured in USD, and the loss frequency data are represented as the number of days between consecutive loss dates. Only events whose state of affairs was 'closed' or 'assumed closed' were considered for the analysis. The total number of data points used for the analysis was 1727 for the frequency data and 1802 for the severity data. The losses were classified into six types: 'Relationship', 'Human', 'Legal', 'Processes', 'Technology', and 'External'. The magnitudes of losses were adjusted for inflation. A summary of the data availability is given in Table 1.¹¹ Histograms of the data sets showed significant kurtosis and right-skewness. Spikes in the right tail of the severity distribution indicate the significance of high-magnitude losses.

Event type      Severity    Frequency
Relationship    585         585
Human           647         647
Legal           75          -
Processes       214         214
Technology      61          61
External        220         220
Total           1802        1727

Table 1: Number of data points per event type.

To test the goodness of fit of various distributions to the data, we used the Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) statistics. The former captures the deviations of the fitted distribution from the empirical one around the median of the fitted distribution, while the latter appropriately weights the discrepancies in the tails of the fitted distribution.

⁹ For β = 2 (inter-arrival times are Gaussian), the limit is the fractional Brownian motion, self-similar with H = (3 − α)/2, a process with stationary and dependent increments.
¹⁰ The name of the bank is not disclosed.
¹¹ The frequency data on the type 'Legal' was missing from our data set.

Lower values of the statistics indicate a better fit. We use the statistics in the following form:

Kolmogorov-Smirnov statistic:    KS = \max_x \left| \hat{F}(x) - F_n(x) \right|

Anderson-Darling statistic:    AD = \max_x \frac{\left| \hat{F}(x) - F_n(x) \right|}{\sqrt{\hat{F}(x) \left( 1 - \hat{F}(x) \right)}}

where \hat{F} denotes the fitted distribution function and F_n the empirical one.
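These two supremum-type statistics are simple to compute from a sample and a fitted CDF; the following sketch evaluates them at the sorted sample points (where the empirical CDF jumps), illustrated with an Exponential fit to synthetic data. The helper names are ours.

```python
import numpy as np
from scipy import stats

def ks_ad(data, fitted_cdf):
    """Supremum-form KS and AD statistics as defined in the text:
    deviations |F_hat - F_n|, with the AD version weighted by
    sqrt(F_hat * (1 - F_hat)) to emphasize the tails."""
    x = np.sort(data)
    n = len(x)
    F = fitted_cdf(x)
    # empirical CDF just before and at each jump point
    Fn_hi = np.arange(1, n + 1) / n
    Fn_lo = np.arange(0, n) / n
    dev = np.maximum(np.abs(F - Fn_hi), np.abs(F - Fn_lo))
    ks = dev.max()
    w = np.sqrt(np.clip(F * (1 - F), 1e-12, None))  # avoid division by zero
    ad = (dev / w).max()
    return ks, ad

rng = np.random.default_rng(3)
sample = rng.lognormal(0.5, 0.8, size=500)       # synthetic 'loss' data
lam = 1.0 / sample.mean()                        # Exponential MLE rate
print(ks_ad(sample, lambda x: stats.expon.cdf(x, scale=1 / lam)))
```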

The majority of the frequency data points were recorded as 'dd/mm/yy', but some were recorded with the year 'yy' only. The second type of data can be considered censored. The following table shows the fraction of censored data entries for each loss event type in our data set.

Event type      Fraction of censored data
Relationship    31/585     (5.3%)
Human           50/647     (7.7%)
Processes       15/214     (7%)
Technology      30/61      (49%)
External        60/220     (27%)
Total           186/1727   (10.77%)

Table 2: Fractions of censored frequency data entries per event type.

The problem is that these censored data points cannot simply be omitted from the analysis of the frequency data: doing so would result in a loss of efficiency of the estimation, and the Maximum Likelihood estimators would be inconsistent. To account for the censored data, we suggest using a 'refined' likelihood function of the following form:

L(x; \theta) = \big( f(x; \theta) \big)^{u} \prod_{j=1}^{J} \big( F_{(c,d)_j}(x; \theta) \big)^{(n-u)_j},

where n is the total number of data points, u and (n − u) are the numbers of uncensored and censored data points, respectively, and F_{(c,d)_j} is the cumulative distribution function evaluated between c = January 1 and d = December 31 of year j = 1, 2, ..., J; θ is the set of parameters. All data points are assumed to follow the same distribution. The log-likelihood function then becomes

\ell(x; \theta) = u \log f(x; \theta) + \sum_{j=1}^{J} (n - u)_j \log \int_{c_j}^{d_j} f(x; \theta)\, dx.

The Maximum Likelihood estimates can then be obtained in the usual way.
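A sketch of this refined likelihood is given below for an (assumed) Exponential inter-arrival model: uncensored observations contribute density terms, and each year-censored observation contributes the probability mass of its censoring interval. The function names, synthetic data, and optimizer settings are ours.

```python
import numpy as np
from scipy import stats, optimize

def neg_censored_loglik(log_rate, exact, intervals):
    """Negative 'refined' log-likelihood: log-density terms for exact
    observations, plus log(F(d) - F(c)) for each censored observation.
    Parameterized by log(rate) for numerical stability."""
    rate = np.exp(log_rate)
    ll = np.sum(stats.expon.logpdf(exact, scale=1 / rate))
    for c, d in intervals:
        p = stats.expon.cdf(d, scale=1 / rate) - stats.expon.cdf(c, scale=1 / rate)
        ll += np.log(max(p, 1e-300))
    return -ll

rng = np.random.default_rng(4)
data = rng.exponential(scale=20.0, size=300)   # synthetic inter-arrival days
exact, censored = data[:270], data[270:]
# pretend the last 30 observations are only known to fall within a year
intervals = [(0.0, 365.0)] * len(censored)

res = optimize.minimize_scalar(neg_censored_loglik, args=(exact, intervals),
                               bounds=(-8, 2), method="bounded")
print("fitted rate:", np.exp(res.x), " (true rate 1/20 =", 1 / 20.0, ")")
```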

5.2 Goodness of Fit Tests

We fitted a number of distributions to the severity and (uncensored) frequency data. Parameters were estimated by Maximum Likelihood in the Matlab environment. For the analysis, the distributions were divided into two broad groups: first, one-sided distributions, for which the data sets were used in their original form, and second, two-sided distributions, for which the data sets were symmetrized. The KS and AD statistics are summarized in the following tables; the lowest value in each column indicates the best-fitting distribution. Table 3 demonstrates that heavy-tailed distributions in general show a much better fit (in the majority of cases) for modeling the loss severities than the suggested Lognormal distribution, as observed from the lower KS statistic values. Figures 2 and 3 show the QQ-plots (quantile-quantile plots) of the empirical distributions of the six loss types against the fitted LogStable and symmetric Stable distributions. Dots lying along the straight line indicate a good fit.

                          Relationship  Human    Legal    Processes  Technology  External
One-sided distributions
  Lognormal               0.0284        0.0503   0.1505   0.0574     0.1146      0.0449
  logStable               0.0354        0.0494   0.1521   0.0482     0.0906      0.0444
Two-sided distributions
  Student t               0.0624        0.0624   0.5000   0.1262     0.2377      0.0568
  Cauchy                  0.0624        0.0294   0.5000   0.1262     0.2377      0.0568
  Stable                  0.0179        0.0116   0.1003   0.0262     0.0693      0.0190
  Normal                  0.3141        0.2946   0.1299   0.2860     0.2216      0.3404

Table 3: Kolmogorov-Smirnov statistics for loss severity data.

                          Relationship  Human    Processes  Technology  External
One-sided distributions
  Exponential             0.3447        0.2726   0.2914     0.2821      0.3409
  Lognormal               0.3090        0.2550   0.2575     0.2821      0.3409
  Stable                  0.1020        0.0944   0.1090     0.1153      0.1013
  Pareto                  0.3090        0.2222   0.2649     0.2821      0.3409
Two-sided distributions
  Student t               0.3083        0.3070   0.3623     0.4077      0.3523
  Cauchy                  0.2643        0.2630   0.3252     0.3924      0.3157
  Stable                  0.1234        0.1068   0.0899     0.0931      0.1084

Table 4: Kolmogorov-Smirnov statistics for loss frequency data.

                          Relationship  Human    Processes  Technology  External
One-sided distributions
  Exponential             1.6763        13.592   2.3713     3.4130      2.0864
  Lognormal               1.1354        0.9823   1.2383     2.8887      1.5302
  Stable                  0.2256        0.2149   0.2740     0.3318      0.2302
  Pareto                  1.1127        0.4614   1.3883     2.3497      1.9164
Two-sided distributions
  Student t               7.5618        5.9271   23.275     47.953      12.510
  Cauchy                  1.8138        1.4750   2.5744     4.7790      2.5338
  Stable                  0.2469        0.2144   0.1798     0.1882      0.2169

Table 5: Anderson-Darling statistics for loss frequency data.

Table 4 presents the KS statistics for the frequency distributions of the five loss event types. Among the one-sided distributions, the Stable provides the best fit, and among the heavy-tailed two-sided distributions, the symmetric Stable shows uniformly the best fit, with the lowest KS values. Fitting the Exponential distribution results in a poor fit: its KS values are roughly three times higher than those of the heavy-tailed alternatives. The AD statistics in Table 5 confirm that the thin-tailed Exponential distribution behaves poorly around the tails of the frequency distributions of all five loss types. The heavy-tailed one-sided and two-sided Stable distributions again act as the best candidates for modeling the operational loss frequency distribution.

[Figure 2: QQ plots for loss magnitudes, LogStable fit. Panels: (a) Relationship, (b) Human, (c) Legal, (d) Processes, (e) Technology, (f) External. Each panel plots log-Stable quantiles against empirical quantiles.]

The example in Figure 4 illustrates the better fit of the two-sided symmetric Stable distribution (bottom of figure) for modeling the 'External' operational loss frequency distribution, as opposed to the fit of the Exponential distribution (top of figure). The difference is most evident from the cdf and QQ plots: the fitted symmetric Stable cdf is much closer to the empirical cdf (the step plot) in the tails than the Exponential, and in the QQ-plot the dots are more closely aligned with the straight line in the Stable case.

                  Relationship  Human    Legal    Processes  Technology  External
Severity data
  LogStable       1.9993        1.9754   1.8594   1.9993     2           1.9977
  2-sided Stable  0.6324        0.5563   1.5648   0.5485     0.2004      0.5376
Frequency data
  Stable          1.3722        1.4939   -        0.8001     1.2734      1.5201
  2-sided Stable  1.5153        1.5149   -        0.8802     1.4617      1.5540

Table 6: Values of the stability index α.

Table 6 gives the values of the stability index for the Stable distributions fitted to the severity and frequency data. We also conducted ARMA/GARCH tests (results are omitted in this paper) of whether the occurrences of the loss events have some dependence structure. An ARMA(1,1) model was fitted to the inter-arrival times after the mean was subtracted from each respective data set. The ARMA(1,1) coefficients were significant for the 'Relationship', 'Human' and 'External' types of inter-arrival times, indicating the possibility that the inter-arrival times are not i.i.d. Clustering of volatility was present for the 'Relationship', 'Human' and 'Processes' inter-arrival times, as suggested by the ARCH(1) and GARCH(1) coefficients being significantly different from zero.

[Figure 3: QQ-plots for operational loss magnitudes, symmetric Stable fit. Panels: (a) Relationship, (b) Human, (c) Legal, (d) Processes, (e) Technology, (f) External. Each panel plots symmetric Stable quantiles against empirical quantiles.]

[Figure 4: 'External' inter-arrival times. Top: Exponential fit; bottom: symmetric Stable fit. Each row shows the fitted pdf, the fitted cdf against the empirical cdf, and a QQ-plot.]

Furthermore, Gaussian and α-Stable distributions were fitted to the standardized residuals from the ARMA(1,1) models. The α-Stable distribution showed a better fit (with stability index above 1) for the residual inter-arrival times in all cases.
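A test of this kind can be reproduced with standard time-series tooling; the sketch below demeans a series of inter-arrival times and fits an ARMA(1,1) via statsmodels (synthetic data are used here, since the bank's data set is not public).

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
# synthetic stand-in for one event type's inter-arrival times (in days)
y = rng.exponential(scale=20.0, size=600)
y = y - y.mean()                          # subtract the mean, as in the text

# ARMA(1,1) is ARIMA with no differencing
model = ARIMA(y, order=(1, 0, 1)).fit()
print(model.summary())                    # inspect AR/MA coefficient significance
resid = model.resid / model.resid.std()   # standardized residuals, ready for
                                          # subsequent distribution fitting
```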

6 Value-at-Risk and Expected Tail Loss

Value-at-Risk determines the worst amount that one can expect to lose with a certain probability over a given time horizon. VaR is thus the (1 − α)-th percentile of the cumulative loss distribution function:

\mathrm{VaR}_\alpha = G^{-1}(1 - \alpha) = \sup\{ x : G(x) \le 1 - \alpha \},

where G^{-1} is the inverse of the loss distribution function and 1 − α refers to some high percentile such as 95-99%. VaR can be estimated using one of the following procedures [21]: parametric estimation, historical simulation, Monte Carlo simulation, or stress testing. To test the accuracy of a VaR model, backtesting is required. Banks usually perform backtesting on a monthly or quarterly basis, and are penalized if their model does not perform sufficiently well. One common test compares the number of exceedances beyond VaR predicted by the model with the actual number of exceedances in the data. For example, for a credit risk VaR based on 255 trading days at the 99% level, the acceptance rule is N ≤ 5 at the 95% confidence level, where N denotes the count of exceedances. Similar rules need to be established for operational risk; BIS suggests using green, orange and red zones for anomalies. Other tests may include comparing the magnitudes of violations, the mean cluster size, estimating the size of under- or over-allocation of capital, the Kupiec likelihood ratio test, and the Crnkovic-Drachman Q-test, among others (for a discussion of these and other issues, see [7]).

The VaR procedure for estimating the operational risk capital charge has serious drawbacks. One serious critique of VaR is that it merely determines a lower bound for the extreme losses given the pre-specified conditions, namely the confidence level and time horizon. It is unable to provide information regarding the extent and the distribution of the losses that lie beyond this high quantile. Moreover, the more heavy-tailed the distribution, the more likely it is that the capital charge will be underestimated. As an alternative to VaR, we suggest using the Expected Tail Loss (ETL), also called Tail VaR, Expected Shortfall, or Conditional VaR. The ETL at level α is the expected loss given that the loss exceeds VaR:

\mathrm{ETL}_\alpha = E\left[ X \mid X > \mathrm{VaR}_\alpha \right].

ETL can be estimated parametrically and non-parametrically. The main advantage of using ETL instead of VaR is that it accounts for the shape of the losses in the upper tail of the distribution, which results in more conservative estimates of the capital charge. Another advantage of ETL over VaR is that it satisfies the subadditivity property, which often fails for VaR. As discussed in [1], coherent risk measures should satisfy four axioms: (1) translation invariance, (2) subadditivity, (3) positive homogeneity, and (4) monotonicity. Subadditivity essentially states that diversification of a portfolio decreases overall risk:

\rho(X_1 + X_2) \le \rho(X_1) + \rho(X_2)    (39)

Subadditivity of ETL ensures that the capital charge is not overstated.
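The sketch below estimates both measures non-parametrically from a Monte Carlo sample of aggregate annual losses, using Poisson counts and (hypothetical) α-Stable severities folded to positive values. All parameter values and the positivity fix are illustrative assumptions, not the paper's calibration.

```python
import numpy as np
from scipy.stats import levy_stable, poisson

rng = np.random.default_rng(6)

def simulate_annual_losses(n_sims, lam, alpha, beta, loc, scale):
    """Simulate S = X_1 + ... + X_N with N ~ Poisson(lam) and severities
    drawn from an alpha-Stable law, folded to positive values (a crude
    positivity fix for this sketch only)."""
    totals = np.empty(n_sims)
    counts = poisson.rvs(lam, size=n_sims, random_state=rng)
    for i, n in enumerate(counts):
        x = levy_stable.rvs(alpha, beta, loc=loc, scale=scale,
                            size=n, random_state=rng)
        totals[i] = np.abs(x).sum()
    return totals

def var_etl(losses, level=0.99):
    """Empirical VaR (the level-quantile) and ETL (mean loss beyond VaR)."""
    var = np.quantile(losses, level)
    etl = losses[losses > var].mean()
    return var, etl

losses = simulate_annual_losses(10_000, lam=50, alpha=1.7, beta=1.0,
                                loc=1.0, scale=0.5)
print(var_etl(losses, 0.99))  # ETL exceeds VaR, and by more the heavier the tail
```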

7 Summary and Future Work

In this paper we have provided empirical evidence against the uniform use of the Lognormal and Poisson distributions for the severity and frequency distributions in operational risk modeling. The Lognormal distribution, suggested by BIS for modeling loss magnitudes, possesses relatively thin tails. Section 5 demonstrated that thin-tailed distributions give a poor fit and leave large-magnitude losses out of consideration, while α-Stable distributions provide a good fit. The same applies to the frequency distribution: BIS suggested that the frequency distribution be assumed Poisson, implying an Exponential distribution of the inter-arrival times process, again a thin-tailed distribution. We discussed in Sections 4.2 and 4.3 how the simple compound Poisson process can be generalized to the compound Cox process and compound mixed Poisson processes, which allow for a random intensity factor. The nature of the intensity process can result in the inter-arrival times following thin- or heavy-tailed distributions. Our empirical results in Section 5 illustrated that thin-tailed distributions such as the Exponential do not account for the heavy tails in the frequency distribution.

α-Stable distributions serve as reasonable candidates for modeling the frequency distributions of operational losses. The limiting behavior of a compound process with heavy-tailed inter-arrival times and a heavy-tailed generating process is that of a self-similar process; moreover, the limit is again a heavy-tailed process. In our overview of the stochastic models, we showed that if both the frequency and the severity distributions are heavy-tailed, then so is the entire compound operational loss process. This has a serious implication for the use of VaR in determining the operational risk capital charge, as VaR is directly related to the tail behavior of the compound loss process. To account for the distribution of the tails, we recommend using ETL, which results in a more conservative capital charge.

One direction for future work is a more detailed time-series analysis of the frequencies and severities, in order to analyze possible dependence between losses over time. It is very likely that some seasonality or clustering is present in the data: for example, if a bank recruits new employees around the same time of the year, then around that time an increased number of human errors resulting in operational losses is likely. As another example, if an operational loss occurs due to a system-related error or a legal issue, then it is likely to be followed by a whole series of events of a similar nature, resulting in further losses. Another intriguing direction is analyzing operational risk in the framework of Integrated Risk Management. Fama [9] identified three forms of capital market efficiency: (1) the weak form, (2) the semi-strong form, and (3) the strong form. They differ by the scope of information that is believed to be incorporated in stock prices: past (weak form); past and public (semi-strong form); and past, public and private (strong form), respectively. We believe that this theory can be generalized to the analysis of the effects of market, credit and operational risk on the volatility of the stock market and other financial indicators. In stock price fluctuations, it is generally believed that small fluctuations are attributable to market risk, larger fluctuations to credit risk, and the most dramatic fluctuations to operational risk-related events. Financial institutions are believed to hide a vast number of operational losses from the public. However, if the capital markets are assumed to be efficient up to the strong form, then even the private information about operational losses, often publicly undisclosed, has already been appropriately reflected in stock prices. For the analysis of the integration of risks, we consider truncating the stock price data at various threshold levels and investigating the changes in regimes, for both severity and frequency. Using time series analysis, it would also be possible to check the i.i.d. assumption at each stage; i.i.d. behavior in the data is likely to be achieved at the level beyond which the stock price fluctuations are due to the excess operational losses. Again, the α-Stable distributions can be expected to provide a good fit for the largest fluctuations in the financial data.

Acknowledgements

We are grateful to S. Trück for providing us with the data on which the empirical study was based. We are also thankful to S. Stoyanov of the Bravo Risk Management Group for computational assistance.

References

[1] P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath. Coherent measures of risk. Mathematical Finance, 9:203-228, 1999.
[2] V. E. Bening and V. Y. Korolev. Generalized Poisson Models and their Applications in Insurance and Finance. VSP International Science Publishers, Utrecht, Boston, 2002.
[3] BIS. Consultative document: operational risk. www.bis.org, 2001.
[4] BIS. Working paper on the regulatory treatment of operational risk. www.bis.org, 2001.
[5] BIS. The 2002 loss data collection exercise for operational risk: summary of the data collected. www.bis.org, 2003.
[6] M. Crouhy, D. Galai, and R. Mark. Risk Management. McGraw-Hill, New York, 2001.
[7] M. G. Cruz. Modeling, Measuring and Hedging Operational Risk. John Wiley & Sons, New York, Chichester, 2002.
[8] P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin, 1997.
[9] E. Fama. Efficient capital markets: a review of theory and empirical work. Journal of Finance, 25:383-417, 1970.
[10] J. Grandell. Aspects of Risk Theory. Springer-Verlag, New York, 1991.
[11] J. Grandell. Mixed Poisson Processes. Chapman & Hall, London, 1997.
[12] P. Jorion. Value-at-Risk: the New Benchmark for Managing Financial Risk. McGraw-Hill, New York, second edition, 2000.
[13] R. B. Leipnik. On lognormal random variables: I - the characteristic function. Journal of the Australian Mathematical Society, Series B, 32:327-347, 1991.
[14] J. B. Levy and M. S. Taqqu. On renewal processes having stable inter-renewal intervals and stable rewards. Les Annales des Sciences Mathématiques du Québec, 11(1):95-110, 1987.
[15] J. B. Levy and M. S. Taqqu. Renewal reward processes with heavy-tailed interrenewal times and heavy-tailed rewards. Bernoulli, 6(1):23-44, 2000.
[16] V. Pipiras, M. S. Taqqu, and J. B. Levy. Slow, fast and arbitrary growth conditions for renewal-reward processes when both the renewals and the rewards are heavy-tailed. Bernoulli, 10(1):121-163, 2004.
[17] S. T. Rachev, editor. Handbook of Heavy Tailed Distributions in Finance. Book 1 in North Holland Handbooks of Finance (series editor W. T. Ziemba). Elsevier, Amsterdam, Boston, London, New York, 2003.
[18] S. T. Rachev and S. Mittnik. Stable Paretian Models in Finance. John Wiley & Sons, New York, 2000.
[19] S. T. Rachev, S. Mittnik, and M. Paolella. Stable Paretian modeling in finance: some empirical and theoretical aspects. In R. J. Adler, R. E. Feldman, and M. S. Taqqu, editors, A Practical Guide to Heavy Tails: Statistical Techniques and Applications, pages 79-110, Boston, 1998. Birkhäuser.
[20] S. I. Resnick. Extreme Values, Regular Variation, and Point Processes. Springer-Verlag, New York, 1987.
[21] RiskMetrics. Return to RiskMetrics: the evolution of a standard. www.riskmetrics.com, 2001.
[22] G. Samorodnitsky and M. S. Taqqu. Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance. Chapman & Hall, London, 1994.
[23] O. Thorin and N. Wikstad. Calculation of ruin probabilities when the claim distribution is lognormal. Astin Bulletin, 9:231-246, 1977.
[24] G. Willmot. Asymptotic tail behaviour of Poisson mixtures with applications. Advances in Applied Probability, 22:147-159, 1990.