index-based insurance is currently receiving much attention. ... Simulations of index-based insurance show that, while a parametric form can fit the ideal.
Risk minimizing index functions for price-weather insurance, with application to rural Ghana 1 Michiel Keyzer Vasco Molini Bart van den Boom March, 2007 Abstract Poor farmers find it difficult to cope with price-weather shocks through self-insurance, because they cannot afford to keep large stocks and to protect their crops through irrigation and other measures. Mutual insurance is not an option either, because all participants would be faced with the same price-weather conditions at the same time. The next option of market insurance is plagued by excessive monitoring cost in avoidance of moral hazard and adverse selection. Consequently, new types of insurance are needed. Among the arrangements suggested, index-based insurance is currently receiving much attention. Index-based insurance offers an indemnification according to an index function that depends on agreed upon price and weather conditions rather than on an assessment of damage at individual farm level. Existing proposals and experiments present a synthetic index function whose effectiveness is established by assessing its capacity to stabilize revenues on the historical record. The present paper proposes an approach that is different in that it enables the insurer to offer an indemnification that is optimal from the perspective of the farmer in preventing a fall below a specified poverty line and is self-financing up to a given subsidy. To this effect, we develop and apply a model that minimizes farmers’ risk of receiving an inadequate indemnification. The approach builds on methods from catastrophic risk management in insurance and support vector regression in statistics. It is applied to Ghana, where according to our database compiled for the period 1980-2005 on average 44 percent of the farm population fell below the poverty line. Simulations of index-based insurance show that, while a parametric form can fit the ideal indemnification fairly well (reducing poverty by about 7.5 per cent points), semi-parametric forms can perform better reducing poverty by another 15-25 per cent points, depending on the regularization.
JEL codes G22, I30, Q12, R51 Keywords Poverty, Catastrophic risk, Ideal asset, SV-regression, Index-based insurance, Ghana
1
Paper written for the project "Can Weather and Price Insurance Reduce Rural Poverty in Africa?". The authors are affiliated to SOW-VU, Centre for World Food Studies, Vrije Universiteit, Amsterdam, The Netherlands. They want to acknowledge support from The World Bank (Research Grant no. 7131322) and from the Ghana Statistical Service that kindly provided access to the data, assisted in compiling the price series and organized a seminar in July 2006 to present the work to the public in Ghana. Particular thanks go to the Acting Government Statistician, Dr. N.N.N. Nsowah-Nuamah. Also, comments by Dr. Ernest Aryetee and Dr. Wouter Zant on an earlier version are gratefully acknowledged.
2
1.
Introduction
For most farmers in Sub-Saharan Africa (SSA), crop yields are generally low and subject to climatic variation due to erratic rainfall patterns and frequent outbreaks of pests and diseases. Consequently, households live with the permanent threat of crop failure (Dercon, 1996). Moreover, the prices of major tropical crops produced in the area tend to fluctuate widely from year to year, following world prices on markets where supply variations face an inelastic demand and obeying patterns that are largely independent from local crop conditions, and, therefore, do not compensate for poor harvests (Fafchamps, 1992). In principle, farmers can address these shocks in various ways. One is self-insurance, whereby they take various measures in the physical sphere, such as keeping cash crops in stock until a price recovery has taken place, depleting staple food inventories in dry periods, irrigating dry fields, slaughtering cattle, and selling off jewels. Clearly, these options are hardly within reach of poor farmers, who would have to choose between eating the seeds of the next planting season and selling off the last asset. Mutual insurance is not a promising option either when it comes to dealing with price-weather shocks, because all farmers in the collective are affected by the same price and weather adversities at the same time. Market insurance is on its part plagued by excessive monitoring costs. Measuring damages at individual level is costly, while verifying the identity of the owner of a particular insurance contract and the land to which the contract applies is usually cumbersome as well. Furthermore, the insured farmer may not pay the premium due, or do this too late, when price and weather conditions are no longer uncertain. Finally, the insurance company itself might default on its obligations. Because of this high cost of transactions and the inevitable moral hazard, poor farmers often find themselves trapped in poverty, unable to take advantage of upcoming profitable opportunities, because the risk is too high for them to take a chance. This situation has become a major concern of the development community, particularly at a time that arrangements to control the markets directly, through prices and stock management, have been discarded as less effective. There is a growing literature on pathways to insure the poor better, arguing that a broadening of saftety nets is warranted, not only for equity reasons, but increasingly so to overcome welfare losses and poverty traps caused by farmers’ response to uninsured risk (Dercon, 2005). The government authorities in developing countries have also become very much aware of the issues, and put new policies in place. In urban areas, targeted instruments such as health insurance and free provision of drugs have been introduced for this purpose. However, in rural areas, where few public facilities are available, more general arrangements such as crop insurance seem preferable. As compared to purely publicly funded schemes, possibly subsidized market insurance offers the advantage that the insured groups pay some premium themselves. Besides alleviating the pressures on the already overburdened national budgets, this would reduce free riding on arrangements, since local participants who pay their dues will tend to exercise some countervailing pressure when they see that others benefit more than they deserve contractually. Also, by creating pools of policy holders with different risk profiles, say, of farmers with different cropping patterns, market insurance offers possibilities for mutual insurance at above village level, and also for solidarity between population groups, all this of course within the limits set by political realities. Furthermore, at international level, market insurance is considered an interesting option, because the economies in rural SSA are small by international standards, which makes reinsurance on the international financial markets relatively easy. Finally, even when subsidized, crop insurance arrangements are considered less distortionary than many other farm support measures and so far enjoy a Green Box status in the WTO.
3
Nonetheless, the issue has to be addressed that crop insurance suffers from the classic problems of adverse selection and moral hazard, and has high costs of administration and monitoring. In this connection, index-based insurance has been proposed as an adaptation that is better tailored to developing country circumstances (Skees et al. 1999; Skees et al. 2005). The idea is to condition indemnity payments on variables that are independent from both farmers and insurers decisions but sufficiently correlated with farm income, nonetheless. The present paper aims to contribute to the existing literature on index-based insurance and social safety nets. Whereas the index-based insurance arrangements proposed so far offer indemnification according to some a priori postulated function whose performance is being assessed, we present a methodology and an application to design an indemnification schedule that is optimally adapted to specified needs of farm groups participating in a given risk pool, and that is self-financing up to a specified subsidy, to the extent this can be achieved by a policy that uniformly applies to all policy holders in the pool and is indexed on few, relatively easily observable variables prices, rainfall and land area giving up by the same token the option of full adaptation to individual needs. One reason is that, with such a policy, all individuals in a region will face the same price and weather conditions, and there will be many individuals with (about) the same land area but, say, different cropping patterns, yields, and personal situations. An index function will, whatever its shape, only compute a single indemnification for all of them. The second reason is that the function has to apply in the future, under circumstances other than those found in the historical record. Therefore, avoidance of overfitting is important, whereby the function may nicely fit the past but performs poorly under price and weather conditions beyond the historical record. To deal with both aspects, we adapt semiparametric techniques of Support Vector (SV-) regression (Vapnik, 1998; Schoelkopf and Smola, 2002; Herbrich, 2002). The procedure optimally fits to the needs of individual farmers in that it minimizes their basis risk within a selffinancing framework and deals with overfitting by extending the estimation problem with out of sample constraints, and by a penalization of flexibility (regularization). The remainder of the paper is organized as follows. Section 2 reviews the literature on index-based insurance. Section 3 deals with the theory. It casts the farmer’s problem in risk minimization format with special reference to avoidance of shortfalls below the poverty line and adapts an SV-regression to obtain a program for estimating the indemnification schedule. Section 4, after indicating how the data base for Ghana was compiled, discusses computation of the ideal individualized indemnification, for three possible insurance pools and related premium payments. For this, we construct a data set of hundred representative farm households, based on data from four rounds of the GLSS (GSS, 2000; Deaton, 1985), and simulate the income of each, under 26 states of the world constructed on the basis of time series of rainfall, prices and yields for 18 different crops. In Section 5, using these indemnification profiles and premiums as data, we estimate indemnification functions, comparing parametric forms with semi-parametric ones, under various additional out of sample constraints and regularization weights, with their ability to reduce poverty as main performance criterion. Section 6 concludes. 2. Literature: Index-based insurance in developing countries Rather than compensating for a damage assessed on at individual farm level after local inspection, index-based insurance pays out when an agreed upon indicator meets an agreed upon condition. Classical examples are arrangements triggered by the recorded rainfall at a weather
4
station falling below a certain threshold, 2 or, by the price at a local market, a port or some other relevant (international) exchange falling below a floor level. In case both weather and prices have to be accounted for simultaneously, some more elaborate schedule, in fact a function, has to be constructed that can generate indemnification payments under all possible price and weather conditions (Zant, 2005). Reviews of revenue insurance (e.g. Mahul and Wright, 2003) indicate, however, that it is not easy to piece together by synthetic means (as opposed to statistical estimation) a function that predicts well actual individual damages. Yet, as mentioned in the introduction, if a function with this property could be designed, index-based insurance becomes an attractive alternative for traditional crop insurance: it reduces moral hazard and adverse selection, lowers administrative and monitoring costs, and has an enhanced scope for dealing with covariate risks through reinsurance on capital markets. Furthermore, the index-based insurance is transparent for policy holders in the sense that indemnity payments are not individualized. However, index-based insurance, due to the anchoring on a limited number of variables (functions with few dependent variables), the fact that these functions (contracts) are not individualized, and the quantification of the functions on the basis of past data, necessarily leaves part of the risk (basis risk) with the farmer. This basis risk might be particularly large when situations can occur where the payment predicted by the index grossly underestimates the actual damage (Goodwin and Mahul, 2004; Barnett et al., 2006). It will also be higher when the indemnification schedule is poorly adapted to the needs of the policy holders, which is likely to be the case for functions that are postulated on a priori grounds and applied to a collective of farmers with varying, as found in most applications. Indeed, there is a danger of the lower implementation cost to come with an upsurge of basis risk, and farmers’ willingness to buy index-based insurance will naturally be contingent on the specific design of the contract, in particular on whether the contract provides sufficiently high correlation between the index and individual damages, and on modalities of premium payments. Index-based insurance is now piloted in several developing countries. For Mexico, Skees et al. (2002) report on private and public mechanisms for managing covariate risk for natural disasters. The aim is to find proper tools to cover covariate weather risk for a community of poor households. A weather insurance contract is proposed to small farmers, based on an existing procedure for indexation. The mechanism proposed is the grouping of farmers and the subscription of insurance as a group with an ex ante mechanism to distribute payments within the group. For India, Kalavakonda and Mahul (2005) examine the performance of a public index based insurance in Karnataka, the second driest state of India in the South of the country. The paper reviews weaknesses in design, challenges in implementation, and operational problems, and reports that the crop insurance scheme performs poorly both in terms of coverage (number of hectares insured and number of farmers purchasing insurance) and financial performance. In another study in Andhra Pradesh, Veeramani et al. (2005) find similar problems of fitting a weather based index to the needs of poor farmers. The Skees et al. (2001) study for Morocco assesses the feasibility of rainfall-based insurance to provide effective, low-cost drought insurance for farmers and rural dwellers. It focuses on three main cereal crops using data on annual production and planting from 1978-99, 2
Alternatively, index-based insurance is sometimes based on (area-based) yields, though yield patterns are generally more difficult to verify than rainfall patterns. Likewise, to account more explicitly for the suitability of a particular rainfall pattern for crop production, one might base insurance on the length of growing period.
5
and compares the benefits of a rainfall-indexed insurance over the traditional insurance scheme. The proposed insurance is considered convenient since the correlation between rainfall and cereal revenues is found to be rather strong. Insurance contracts would pay the farmers an amount proportionate to the shortfall in actual rainfall during a set period compared with the trigger rainfall. Hess and Syroka (2005) for Malawi, investigate the feasibility of a weather insurance scheme. They construct a maize specific rainfall index as a weighted sum of cumulative rainfall during the 130-day growing period of maize, and link farmers’ incomes to weather fluctuations by regressing incomes on maize yields and on the rainfall index. This function is used to compute the marginal cost of a unit reduction in the rainfall index. Next, using a monthly rainfall trigger, a payout structure is constructed to compensate farmers for income loss. To calculate the premium, the payout of the contract is averaged on the basis of historical rainfall data, and incremented by a nominal fee to cover set up costs. Efforts to implement the program are currently underway. In summary, most applications are still in a pilot phase or subject to revision but experience gained until present would seem to indicate that poor farmers’ remain somewhat reluctant to buy index-based insurance, despite significant subsidies often offered on the premiums. One reason might be that based on the indemnifications paid so far, they consider the basis risk relatively high under the proposed schemes, another that they do not see how the current participation by their neighbors could ever overcome the problems of covariate risk they so often faced under mutual insurance arrangements, hence our attempt to design an insurance that minimizes basis risk and can be specified for any pool of farmers willing to share risk and for various levels of self-financing.
3. Theory: Optimal indemnification 3.1 Optimal indemnification of a single policy holder under catastrophic risk We start generating an optimal indemnification schedule as derived from risk minimizing approach under catastrophic risk. The idea is that the insurance to be considered is not meant to cushion idiosyncratic shocks affecting individuals within a community but rather to avoid ruin for all poor farmers in a particular population group, under assumed solidarity (risk pooling) with richer farmers. Indemnification inclusive per capita income or revenue r( z,τ ,ε ) is defined as a function of (deterministic) decisions z ∈ Z ⊂ R m , where Z is a nonempty, compact and convex set, of an institutional parameter (the premium) τ ∈ R+ , and of stochastic events ε ∈ R n that obey a multivariate density function g . Ruin is expressed as a given income threshold r below which poverty becomes extreme. Decisions z comprise the entire risk management strategy of the farm household, inclusive of self- and mutual insurance as well as commercial insurance, if available, but exclusive of the institutional parameter τ , which the agent must take as given. As in Ermoliev et al. (2000), the farmer maximizes expected utility, while trying to avoid disaster:
maxz∈Z Eu( r( z,τ ,ε )) + ρ E min( u( r( z,τ ,ε ) − u( r ),0 ) ,
(3.1)
where u is a concave increasing utility function, expectation E is taken with respect to ε , and ρ is a given and high penalty coefficient, reflecting that the household wants the ruin to be avoided
6
by all means. Therefore, we may when considering the optimal insurance limit attention to the minimization of expected income shortfalls:
maxz∈Z E min( u( r( z,τ ,ε ) − u( r ),0 ) ,
(3.2)
We treat the purchase of insurance as a discrete risk management strategy z = 1 , and henceforth drop the choice variable z. This is possible because we are concerned with the design of an ideal insurance that operates as last resort and is unrestricted by the payoff structure of any particular asset. Consequently, when providing such insurance, we may consider all other risk coping decisions to have been made already. This ideal insurance provides access to an indemnification profile, denoted by function v + , for a given premium τ . For given poverty line r , given profile of income h( ε ) before indemnification, 0 ≤ h( ε ) ≤ h , and given premium τ , function v + defines the indemnification needed to supplement farmer’s income in case of events ε that bring his income below the poverty line as:
v + ( τ ,ε ) = max( r + τ − h( ε ),0 ) .
(3.3)
Under this arrangement, insured income:
r( τ ,ε ) = h( ε ) + v + ( τ ,ε ) − τ
(3.4)
will never fall below the poverty line r , whereby the expected income shortfall in (3.2) will be zero. However, this is for given premium. To keep the arrangement self-financing up to a (net) subsidy σ covering possible implementation cost as well as part of the indemnification payments, the premium has to be such that the deficit
F( τ ) = ∫ v + ( τ ,ε )g( ε )d ε − τ
(3.5)
is exactly covered by the subsidy
F( τ ) = σ .
(3.6)
We note that, for given subsidy level, it might be impossible to meet this self-financing constraint. After all, the insurance can be no more than a smoothing device, implying that two feasibility requirements must be met: (i) the subsidy must be sufficient to cover the expected income deficit; (ii) it should not be so high that it leaves a surplus after all indemnification payments have been made. Assumption 1 (feasibility of self-financing insurance that eliminates poverty): (i) Subsidy σ is sufficient to cover the shortfall of expected income below the poverty line: σ ≥ r − ∫ h( ε )g( ε )d ε . (ii) Subsidy σ does not surpass the need for indemnification: σ ≤ ∫ max( r − h( ε ),0 )g( ε )d ε .
7
Note that both conditions are irrespective of the premium τ . Indeed, since gross revenue r( τ ,ε ) in (3.4) is continuous, nonincreasing in τ for every ε and decreasing until the premium reaches its maximum τ = max( h − r ,0 ) , it follows that F( τ ) is decreasing . Also, since F( τ ) ≤ σ , by Assumption 1.(i), while F( 0 ) ≥ σ by Assumption 1.(ii), the solution F( τ ) = σ exists and is unique. This is illustrated in Figure 1., at the point where the downsloaping curve F( τ ) intersects with the horizontal line σ . Figure 1: Self-financing and subsidy of an ideal insurance against poverty
F( 0 )
σ
τ
F( τ )
3.2
Risk pooling
Next, we allow for solidarity through risk pooling among policy holders but still deal with ideal market insurance in that every insured farmer holds a fully individualized contract. To represent this, we distinguish groups indexed i, consisting of N i individuals with per capita income profile is 0 ≤ hi ( ε ) ≤ hi , poverty line is r i , and premium τ i that provides access to indemnification vi + . We allow for a differentiation of the premium across groups through a flat premium per hectare τ with corresponding per capita premium τ i = γ iτ in accordance with per capita farm size
γ i > 0 . The indemnification profile is now vi + ( τ ,ε ) = max( r i −hi ( ε ) + γ iτ ,0 )
(3.7)
and leads to the deficit for the insurer:
F( τ ) = ∑ i ni
(∫ v
i
+
)
( τ ,ε )g( ε )d ε − γ iτ ,
(3.8)
where ni = N i / ∑ i N i is the share of group i in total population. This deficit is to be covered from the subsidy, as in (3.7):
F( τ ) = σ .
(3.9)
The equivalent of Assumption 1 now reads: Assumption 2 (feasibility of self-financing risk pooling insurance that eliminates poverty):
8
(i) Subsidy σ is sufficient to cover the shortfall of expected income below the poverty line:
(
)
σ ≥ ∑ i ni r i − ∫ hi ( ε )g( ε )d ε .
(ii) Subsidy σ does not surpass the need for indemnification: σ ≤ ∑ i ni ∫ max[ r i −hi ( ε ),0 ] g( ε )d ε . By the same arguments as for the solution of (3.6) it now follows that F( τ ) is decreasing until the premium exceeds the maximum income in excess of the poverty line, and that (3.9) solves uniquely with a premium level τ . Hence, under Assumption 2, a group of farmers who pool their risk could steer clear of poverty by contributing a common premium per hectare which, in combination with an externally provided per capita subsidy, can finance all claims of those whose income falls below the poverty line. 3.3
Discrete states of nature
Next, to compute the ideal indemnification as data to which the estimation will seek adaptation, we shift from the continuous density g in (3.1)-(3.9), to a discrete state representation by an empirical distribution with states of nature indexed A , and A = 1,...,L , as observations inferred from the historical record. Expectation integrals will now be approximated as means over these states. The discrete state representation of the arrangement (3.8)-(3.9) considers state-specific per capita income hiA , which leads to indemnifications vi+A ( τ ) = max( r i −hiA + γ iτ ,0 ) and mean
(
)
1 n v + ( τ ) − γ iτ . Denoting the mean per capita farm size by ∑ i i ∑ A iA L κ = ∑ i niγ i > 0 , the ideal indemnification profile and premium can now be obtained from the
deficit F( τ ) =
linear program:
minτ ,v+ ,ζ iA
iA
≥0
1 ∑ ni ∑ s ζ iA L i
subject to
ζ iA ≥ r i − hiA − ( vi+A − γ iτ )
(3.10)
1 n v + = κτ + σ ∑ i i ∑ A iA L Indeed, under Assumption 2, this program will find a unique optimal premium and an indemnification profile with zero risk ζ iA = 0 , i.e. build the ideal asset against poverty over the given historical sample. 3.4 Optimal index-based insurance Given the optimal per hectare premium τ and ideally individualized per capita indemnification profile vi+A , computed in (3.10), the optimal index-based insurance seeks to obtain a function that fits these profiles vi+A on a vector xA of observed independent variables appearing in the index and generated by the same underlying distribution of events A . However, as this fitting is a risk
9
minimizing operation in its own right, both choices have to be looked at jointly, to account for tradeoffs and priorities. At this point, it may be recalled from the discussion in the introduction that the index insurance will never be able to eliminate all risk, first because it is represented by a nonindividualized function that, whatever its shape, can only compute a single indemnification for all participants of every group of a given farm size that faces the same weather and price conditions , and second because the function has to apply in an uncertain future, under circumstances other than those found in the historical record. This also suggests that statistical estimation would seem the natural way to proceed but we need to choose an estimation technique that is especially flexible, because unlike in the common situation of econometric estimation, the set of explanatory variables at hand (the variables of the index) is limited, and since individualized contracts are ruled out, the possibility of keeping track of fixed effects by household is ruled out. At the same time, again because the contract is to apply in the future, the pitfall must be avoided of overfitting to a limited data set of past observations. As estimation technique, we opt for Support-Vector (SV) regression, because of its capacity to adjust flexibly to any given data set, while addressing the issue of overfitting, and because it can naturally be merged with the risk-minimizing decisions on indemnification profiles specified sections 3.1 and 3.2. The aim is now to find optimal values of premium τ and indemnification profile vi+A , this time also as close as possible to the index function f ( x ) . We remark that, since we are not going to allow for individualized (i-specific) insurance contracts, all outcomes by group i are also to be considered as events. Hence, events comprise all states A for each group i. For notational ease, we denote them by s = 1,...,S with S = I ⋅ L . For later reference, we denote the group i associated to s by is . Skipping technicalities, we admit the following class of semiparametric functions, in accordance with the (semiparametric) Representer Theorem (Kimeldorf and Wahba, 1970): f ( x ) = ∑ s α s k( xs ,x ) + ∑ j β jφ j ( x )
(3.11)
where k is a given kernel function and φ j are given functions as well. The first term on the right hand side is the non-parametric term, the second the parametric one. The variables x that enter the index are indexed by s, and may, therefore, reflect local weather and price conditions and hence vary across participants. In a more general setting, farm-specific events such as illness could also be allowed for, provided a credible assessment of their occurrence can be agreed upon. Optimality of the proposed insurance will be expressed in terms of minimum risk, where the risk consists of both the income shortfalls and the fitting errors. In addition, the risk criterion is extended by a quadratic regularization term 21 ∑ s ∑ r α sα r k( xs ,xr ) multiplied by the scalar
factor λS . This term is essential to avoid overfitting and can be made to shrink as sample size increases, and to vanish asymptotically. We will return to this aspect below. The problem can be written as one of functional analysis (finding the optimal function rather than the optimal coefficients). After putting the factor ρ > 0 on the priority to avoidance of income shortfall, as in the basic program (3.1) for avoidance of catastrophic risk, the regularized risk minimizing problem can be written as the (feasible and bounded) quadratic program:
10
minτ ,v + ,ζ s
* s ≥0;ξ s ,ξ s ≥0 ,α s ,β j
ρ
1 1 n ζ + ∑ s nis ( ξ s + ξ s* ) ∑ s is s L L
+
λS 2
∑ ∑ α α k( x ,x s
s
r
r
s
r
)
subject to
ζ s ≥ r i − hs − ( vs+ − γ i τ ) s
s
1 n v + = κτ + σ ∑ s is s L vs+ = ∑ r α r k( xr ,xs ) + ∑ j β jφ j ( xs ) + ξ s − ξ s* 1 L
∑
s
(3.12)
nis [ ∑ r α r k( xr ,xs ) + ∑ j β jφ j ( xs )] = κτ + σ
Next, we recall from (3.1) that for priority factor ρ sufficiently large, we can under Assumption 2, solve (3.10) ahead of (3.12), treating the optimal premium τ and indemnification profile vs+ as given data in (3.12), and limit attention to estimation of the parameters ( α , β ) . Then, writing y for per capita indemnification v + , defining σ = κτ + σ as the mean per capita contribution to the insurance budget, and allowing for a soft margin η (with given penalty factor ϑ ), we obtain an SV-regression problem, incremented with a financing constraint:
minξs ,η ≥0 ,α s ,β j
λ 1 nξ + S ∑ s is s L 2
∑ ∑ α α k( x ,x s
r
s
r
s
r
) + ϑη
subject to ys ≤ ∑ r α r k( xr ,xs ) + ∑ j β jφ j ( xs ) + ξ s + η
(3.13)
ys ≥ ∑ r α r k( xr ,xs ) + ∑ j β jφ j ( xs ) − ξ s − η 1 ∑ ni [ ∑ r α r k( xr ,xs ) + ∑ j β jφ j ( xs )] = σ L s s
Since the two inequality constraints are never effective simultaneously, we could replace the equality constraint of (3.12) by two inequality constraints and work with a single vector ξ of absolute value deviations to which the soft margin η is added. SV-regression in fact owes its name to this margin, which is introduced to avoid excessive focus on small errors (Huberrobustness, Huber, 1981). Together, the regularization term and the margin prevent each observation from demanding a non-zero value for its own parameter α s in the non-parametric term. We note that the parametric term could be used to represent an a priori schedule, possibly with some unknown coefficients to be estimated. This expression could be chosen as starting point, with the kernel terms measuring the inadequacy (or correction) of this schedule, whereas the soft margin and the estimation error reflect the basis risk that remains with the farmers, which, for large samples will approximate the risk that is unavoidable for the given anchors of the insurance.
11
Program (3.13) differs from the standard form of SV-regression because of the additional equality constraints representing the financing restriction. This possibility of inserting further restrictions is essential for index-based insurance, since it makes it possible to ensure that the proposed indemnification function will satisfy financing requirements, as opposed to imposing this restriction on the data only. Furthermore, other constraints can be imposed as well. First, solvency constraints would seem of relevance, because the financing constraint in (3.13) only requires the insurer to meet the contractual obligations in the mean and hence neglects the fact that the insurer should be able to meet all contractual obligations from the start in every year of the arrangement, and also during a period of adversity. Thus, the program (3.13) in its present form assumes that the arrangement benefits from a public guarantee, either from national government or from international donors, exempting it from solvency restrictions, which could be incorporated as limits on the cumulative net indemnifications over specified sub-periods, Similarly, such restrictions could be introduced to target the subsidy in favor of poor groups in a particular pool or to limit the net contribution of groups of relatively rich farmers. Computation From a practical perspective, the key feature of the program is that all its constraints are linear in choice variables and that the objective is linear-quadratic and convex. This enables us to solve it numerically by standard tools of quadratic programming, as usual in SV-regression, because all k and φ terms can be evaluated ahead of the program, and collected in the financing constraint, leaving expressions in α , β , η and ξ only. However, in many practical applications, including ours, the dimensionality of the program becomes an issue, as the (square) matrix with kernel elements becomes large. There are various ways out. One is to opt for a more powerful software tool, which does not seem advisable in our case since we seek to present a tool that could run on laptops in Africa. The second is to choose a kernel that more strongly penalizes dissimilarity (has a small window size) and hence has more zeroes in its kernel matrix but this discards interdependencies in the data. A third option is to limit the number of representative households, while adjusting their poverty threshold so as to respect the poverty measurement in the underlying data set. This is the approach follow here, and documented further in the next section. In addition, we also mention bagging as a fourth option. Bagging amounts to creation, by drawing (with or without replacement) from the data set, a series of random samples So ⊂ { 1,.....,S } from the full data set, while keeping the financing constraint, since it is a (a single equation that can, because of the factorization, be maintained for the full set. This leads to the program: minξs ,η ≥0 ,α s ,β j
∑
s∈So
nios ξ s +
λS
o
2
∑ ∑ s∈So
r∈So
α sα r k( xs ,xr ) + ϑη
subject to y s ≤ ∑ r α r k( xr ,xs ) + ∑ j β jφ j ( xs ) + ξ s + η ,
s ∈ So
y s ≥ ∑ r α r k( xr ,xs ) + ∑ j β jφ j ( xs ) − ξ s − η ,
s ∈ So
∑
r∈So
α r [ ∑ s∈S
nis L
k( xr ,xs )] + ∑ j β j [ ∑ s∈S
nis L
φ j ( xs )] = σ
(3.14)
12
where nios is the given weight on observation s, and terms in square brackets are evaluated
beforehand. This series generates a distribution of y-predictions and β -estimates, whose moments can be computed to test for stability and out-of-sample robustness, as is commonly done in SV-regression. Indeed, (3.14) is a natural complement of the estimation on the full sample. Consequently, in our case study for Ghana we keep a sample of size S = 2600 , which corresponds to around 3.4 million quadratic terms, of which around 1 million by appropriate choice of window size in the kernel matrix. After that we intend to establish the robustness of our estimates by expanding the constraint set with solvency constraints and by a series of subsamples in program (3.14) implemented with a broader window size that explores more interdependencies, but this is work in progress. Properties Before turning to the application, we briefly discuss to which extent the resulting estimator inherits the convergence properties of SV-regression estimators. The discussion refers to the full sample in (3.13) but applies to (3.14) as well, and considers a sequence of ever increasing full samples. It can be seen from (3.13) that even though the contract estimated by the function is not individualized, the program seeks to fit it fully to all observations at individual level. One question is then, whether it can achieve this, say, in the absence of soft margin, and when regularization vanishes, and the other, whether the sequence of functions f SA obtained by (3.13) for sets SA ⊂ { 1,.....,S } of iid observations indexed A , will for the size of the underlying data set L → ∞ , regularization factor λSA → 0 and soft margin penalty ϑ → ∞ converge (in specific sense) to the true function f. Regarding the first question, the semi-parametric form (3.11) owes its popularity to the capacity to fit given data, essentially because the matrix with elements [ k( xs ,xr )] , r,s ∈ So , which is known as the Gram matrix, is for any kernel function known to be positive semi-definite symmetric by definition, and to offer a measure proximity between two observations xs and xr . Specifically, while extreme proximity with identical x-values for all observations will reduce the rank of this Gram matrix to unity, extreme distance, with only the diagonal terms as non-zero values, leads to a diagonal form, and hence to full rank. Full rank in turn means that the function has full capacity to fit the given data set: y = Kα + Φβ can for given can for given y and Φβ always find an α -value that solves it. Yet, in our case of an index function with common anchors the rank will often be deficient, because there are several households sharing the same x-values. In an econometric application this might be seen as a limitation of the data set. Here this incapacity to match every individual’s needs is an inherent limitation of the construct of indexbased insurance with non-individualized contracts, The second question relates to the likely performance of the insurance arrangement in the future, and in fact asks how well the estimation procedure performs as a device for learning from past events. As this is a rather technical issue we only mention two properties inherited from SVregression. The first is that, as shown by Koenker and Bassett (1978), minimization of the sum of absolute values of errors in (3.15) amounts to estimating the conditional median (conditional on x), whereas minimization of sum of squared errors estimates as is well known the conditional mean. This is irrespective of the additional financing constraint that enters our problem. The second property is that the process converges to this conditional median (or any other quantile) for vanishing regularization factor, provided the reduction is not too fast and the soft margin disappears fast enough (Takeuchi et al., 2005; Norkin and Keyzer, 2007).
13
4.
Application to Ghana: data and computation of ideal individualized insurance
4.1
Data sources and summary statistics
We now proceed with an application for Ghana. It starts with the compilation of a database of representative farm households for 26 different states of the world. These reflect economic and weather conditions as these effectively occurred in the country as derived from the historical record. We treat them as iid (independent and identically distributed) observations from a stationary distribution. In our view, the relative strength of this dataset, as compared to one obtained from estimates of covariances within a single survey (e.g. Sarris 2002) would be that it includes a broader range of events. The data sources include the Ghana Living Standards Survey GLSS, covering the years 1987/88, 1988/89, 1991/92 and 1998/99 and the Population Census of 1970, 1984 and 2000 (GSS, 1987, 1989, 1995, 2001). In addition, for the period 1980 to 2005, time series data are available for monthly rainfall data at 40 stations throughout Ghana and of Accra-prices for all main crops (GMI, 2006; GSS 2005). To obtain the income profile of each representative household under the respective states of the world, we have derived, from the four rounds groups that show, on average, homogeneous characteristics across rounds, but several limitations of the data set came up. First, division of household’s crop production by area cultivated often gave yields that conflicted seriously with agronomic information (FAO 2005). This might be because farmers did not correctly report about their crop allocation over land. This inconsistency between the production section and land section of the questionnaires was reported earlier for cocoa (Teal and Vigneri, 2004) but appears to be even more frequent for all the other crops. In addition, data on agricultural income augmented with non-agricultural income, appeared to be poorly correlated with expenditures At the same time, land per capita proves to be highly correlated with expenditures in all survey rounds and its distribution relates well to the expenditures distribution. Furthermore, land per capita seems to be a good identifier when constructing homogenous groups over the four rounds since, unlike income and expenditures classes, holding size classes tend to vary little across the four rounds. Hence, we grouped households in quintiles of farm size per capita for each of Ghana’s 10 regions, arriving at 50 representative agents. This gives class bounds that are particularly stable over the four rounds, reflecting the stability of farming systems in Ghana. It also offers a more direct link than income distribution to the resource endowments and geographic location of farmers (Decaluwe et al 2001). After forming these groups, we linked them over the rounds. Next, households characteristics such as crop income, household size, land shares devoted to each crop were averaged, by group, over the 4 rounds. In this way, household groups with characteristics invariant across the 26 states of world were constructed as representative agents for the decennium (1988-1998). We remark that the need to rely on quasi-panel methodology is due to the specifics of the GLSS surveys that form a series of cross sections, rather than a true panel data set (GSS 2000). Yet, each round is considered to constitute a representative survey of the households in the country, and we exploit this property to attribute panel characteristics and link the 4 surveys using
14
the quasi-panel approach introduced by Deaton (1985) and frequently applied to link crosssection samples over time through age cohorts (Deaton, 1997; Beaudry and Green, 2000). Clearly, the transition from around 2000 farm households in the surveys to 50 representative agents amounts to a reduction in variability within groups. As mentioned in the previous section, we address this issue by maintaining major information on expenditures distribution within regions and quintiles, in two ways. First, we further subdivide the groups in two sub-groups say, one relatively poor, endowed with an amount of land per capita closer to lower bound of every quintile, and one richer, with an amount closer to the upper bound. This discards all other distinctive features across households within each quintile and region, but as discussed, maintaining differences in land holding size already keeps track of the major indicator of disparity. Second, given this split in two representative agents for every group denoted by the population fractions PG (poor or poverty rate based on per capita expenditures) and RG = 1 − PG (rich), their per capita land holding is obtained as:
γ PG = (1 − PG )γ LG + PGγ MG γ RG = PGγ UG + (1 − PG )γ MG , where subscripts L, M and U refer to lower, median and upper bound of every group. To the new representative agents is then attributed a new population weight. Finally, the poverty line of each subgroup is adjusted to reproduce the income shortfall of that group as measured in the original data set. To calculate yields and crop production we extracted information on the cropping pattern from survey rounds. Farmers cultivate a selection of 18 crops 3 in accordance to regional cropping patterns that reflect climatic variation in the country with predominance of tree crops like cocoa, oil palm and plantains in the south and of cereals like sorghum and millet in the North. To obtain yield in the different states of the world, we constructed time series of regional yields from climate and agronomic data (Chapagain and Hoekstra, 2004). These data were integrated with the information extrapolated from surveys on yield variability in order to maintain realistic differentiation in cropping patterns within and between groups. Non-crop income is the closing item of the income-expenditure account in our data base, which is essential as it enables us to account for all risk coping actions undertaken by farmers, using sources external to their main activity, through remittances, seasonal jobs outside agriculture, retail trade activities etc. Non-crop income (actually non-crop sources of financing expenditures) makes it possible to assume that the actual income profile as measured is inclusive of all other risk management strategies such as joining a mutual insurance, irrigating, modifying the crop composition. Indeed, as discussed in relation to ideal asset (3,2), the insurance can be interpreted as a proxy of an ideal financial asset precisely because it is conditional on farmers’ risk management strategies in all other domains, real as well as financial. Summary statistics are shown in Table 1. The regional distribution of total income and crop income, (column 1 and 2) strikingly highlights the sharp economic North-South divide in the country. Farmers in the three Northern regions (Northern, Upper East and Upper West) have an 3
Cocoa, Oil palm, Plantains, Bananas, Oranges, Ground nut/peanut, Pineapple, Cassava, Yam, Cocoyam, Maize, Rice, Sorghum/Millet/Guinea corn, Tomato, Okro, Garden egg, Beans and peas, Pepper.
15
average income below the 700.000 cedi poverty line (column 1 and 5), while in the rest of country the average income is above (Volta) or significantly exceeds this threshold. We also remark that on average crop income accounts for three quarters of total income of farm households. Spatially this share ranges between less than 50% in Greater Accra, to more than two thirds in the rest of the country, rising as one moves into the Western part of the country. Column 3 of the table indicates that on average shortfall below the poverty line occurs for some farm groups in all regions in some of the simulated states of the world, except in the Greater Accra region, where farm households are relatively rich and off-farm income plays a major role. Finally, columns 7 and 8 point to significant differences also in the rainfall pattern and consequently in the Length of Growing Period, LGP 4 , as is illustrated further in Figure 2. This is reflected in the variation of cropping pattern across the regions. In the North, Upper East and West, due to shorter and less dependable precipitation, farmers typically cultivate annual crops and choose varieties that need less water, such as sorghum, millet, beans and nuts. In the Southern half of the country where precipitation is more frequent, they turn to perennial crops such as cocoa, palm oil, yam, cocoyam and plantains.
4
Under rain-fed conditions the start of the LGP coincides with that of the rainy season. Precipitation exceeding half of the reference evapotranspiration is considered sufficient to meet the water requirements of a crop (FAO, 1979).
16
Table 1: Summary statistics on farm household characteristics in Ghana, by region Region
Income Crop Average Farm size (1000 cedi income (ha per Shortfall per (1000 cedi below the capita) capita) per poverty line capita ) (1000 cedi per capita) 1157 872 62 0.49 1297 1045 27 0.72 1077 939 133 0.61 974 688 98 0.39 1204 603 91 0.35 669 413 179 0.55 647 370 182 0.47 685 500 156 0.61 855 510 146 0.34 1552 1391 42 0.70 1040 781 103 0.53
Farm population (%)
Farmed land (%)
LGP annual crops (days)
LGP perennial crops (days)
Ashanti 8.6 18.4 217 Brong Ahafo 9.9 11.6 219 Central 9.1 9.8 165 Eastern 8.6 14.1 226 Greater Accra 6.7 2.4 112 Northern 13.4 10.3 219 Upper East 11.1 8.2 177 Upper West 13.5 4.1 180 Volta 8.4 10.4 232 Western 10.7 10.7 212 Ghana 100 100 196 Source: GSS (1989, 1995, 2000), FAO (1979) and authors calculations. Income figures are deflated to March 1999 Accra prices (GSS, 2000), the exchange at the time was 2394 cedi/$
Figure 1: Rainfall in Ghana: 1980 to 2005 National average, Ashanti Region, and Northern Region
1600 1400 1200 1000 800 1980
1985
1990 Ashanti National Average
1995
2000 Northern
Source: Ghana Meteorological Institute (2006) and authors’ calculations
2005
276 261 255 282 148 222 183 196 264 306 239
17
4.2
Ideal individualized indemnification for three types of insurance pools
To be of interest to farmers, an indemnification schedule should on the one hand be kept sufficiently flexible to fit their needs, as was discussed at length in section 3, and on the other hand apply to a pool of farmers whose participants of which are willing to share their risk. Consequently, the success of an initiative to introduce insurance will critically depend on its ability to build on existing expressions of solidarity as reflected in prevailing mutual arrangements. In this section, we restrict attention to three hypothetical insurance pools and compute the ideal premium and indemnification profiles needed to avoid all income shortfalls below the poverty line in the pseudo-panel constructed for Ghana, in accordance with Figure 1, equation (3.9) and program (3.10), for different financing constraints expressing the participation of groups i in the arrangement. In fact, the solution is obtained iteratively, through a Newton-Raphson line search. The corresponding per hectare premium τ is self-financing up to a given external subsidy σ . To take into account potential implementation costs, we augment the premium with a mark up of around 8%, as in Hess and Sykora (2004). This rate includes the profit margin necessary to attract international insurance companies, so as to allow for international reinsurance. In any case, the design of an ideal insurance requires the average income in a pool plus the net subsidy to equal at least the insured minimum income (Assumption 2). Consequently, an insurance against poverty for farmers who are structurally below the poverty line can only work in a pool that also comprises relatively rich farmers, or requires heavy financial support. The three arrangements considered are: a. Full solidarity. The pool consists of all farm households in the country, in this case no subsidy is required ( σ = 0 ). b. Regional solidarity. The pool includes all farmers living in the same region. Program (3.10) is now solved for each of the 10 regions, given a region-specific net subsidy σ R c. Group solidarity. The pool includes all farmers living in the same region and belonging to the same land per capita quintile. For each of the 50 groups indexed, the arrangement has a given group-specific per capita net subsidy σ G . Table 2 shows the premium and the degree of self-financing, resulting from the calculations for the three arrangements.
18
Table 2: Per hectare premium under group pool, regional pool and national pool Quintile 1 Quintile 2 Quintile 3 Quintile 4 Quintile 5 Regional National scheme scheme 299* 181 3 0 0 135 237 Ashanti 254* 815* 13 0 0 39 237 Brong-Ahafo 488* 546* 255 37 9 263 237 Central 313* 74 4 0 0 285 237 Eastern 275* 47 2 0 0 297 237 Greater Accra 2230* 726* 382* 127 0 950* 237 Northern 2814* 793* 265* 76 0 1069* 237 Upper East 2653* 695* 333* 391 6 956* 237 Upper West 410* 593* 0 0 0 565 237 Volta 283* 55 12 0 0 61 237 Western *figures with asterisk indicate the presence of a subsidy per capita on the premium paid **The regional subsidies amount to 50.000 cedi per capita in Upper East, 34.000 in Northern and 14.000 in Upper West region. ***The group subsidies range from zero and small values in the upper quintiles to as much as 500.000 in the first quintile of the poorest regions.
Source: GSS (1989, 1995, 2000) and authors’ calculations. a. Full solidarity. The case of full solidarity has nationwide participation (column 7 of Table 2) . Self-financing is feasible. The arrangement would operate as a social security fund financed from a fixed land tax. The fund would pay out to those who would otherwise fall into poverty. The self-financing ensures that the taxation is actuarially fair, in the sense that it suffices for the expected indemnification. The annual premium is around 237 cedi per hectare, or about 12% of estimated average per hectare income of 1,965,000 cedi in the period considered. b. Regional solidarity. Under a regional arrangement the premium varies strongly across regions (column 6). For example, for the three regions in the North, average income of all farmers is below the poverty line of 700,000 cedi per capita per year. In such a situation, it clearly is impossible to cover the risk of falling into poverty, and a minimum subsidy of about 10% of the average income is required to bridge the gap. While the South has only few cases of income shortfalls, almost all confined to the first two quintiles, in northern regions the frequent shortfalls imply that significant subsidies are needed on top of the premium. Relative to the national premium per hectare, the regional premium drops in three regions (Ashanti, Brong Ahafo and Western), while it slightly increases in Central, Eastern, Greater Accra and doubles in Volta, but without any need for external subsidy. By contrast, in the three northern regions perfect income smoothing and full regional solidarity are still insufficient to let farmers escape from poverty, even if all were to pool their full income above the poverty line within the insurance budget. Hence under regional insurance all farmers would, in the absence of subsidies, have to cluster well below the poverty line. c. Group pooling. As could be expected, the arrangement mainly affects the income distribution in the lower quintiles only. In the first quintiles all farmers are lifted above the poverty line thanks to the highly subsidized indemnifications (columns 1 and 2). From the 3rd quintile upward, excluding the three Northern regions, the indemnification by group is sustainable without requiring any subsidy. The three Northern regions have special conditions. The first to the third
19
quintiles that remain below the poverty line throughout, are always recipients of transfers. Only from the 4th quintile onwards farmers are able to contribute to the scheme, requiring a less than full subsidy. Finally, in almost all regions the 5th quintile never falls below the poverty line. Farmers in this category have little stake in acceding to insurance against poverty, since their income is already above the poverty line, even under the least favorable conditions.
5.
Index-based insurance: estimation and simulation
5.1
Specification
The present section reports on the results from estimation and simulation with index-based insurance schedules designed through semi-parametric regression of the ideal indemnification discussed above on specified price and weather variables and with farm size as basis for the contract, following programs (3.13) and (3.14). The software package described in Keyzer (2005) is used for computation of the estimates. Since the aim of the exercise is also to compare our approach with the synthetic forms in the literature (see section 2), we also discuss results from fully parametric regressions, which can be seen as intermediate in terms of flexibility between synthetic and semi-parametric forms. The data comprise I = 100 households under L = 26 states of nature, leading to a sample size of S = I ⋅ L = 2600 . As index variables we use the Length-of-Growing-Period ( x1 ), six prices ( x2 −7 ) covering both cash and staple crops (cocoa, cassava, yam, cocoyam, maize and tomatoes), and per capita farmsize x8 . The parametric form is postulated to be linear φ j ( x ) = x j with an additional φ0 ( x ) = 1 for the constant. The estimation proceeds in three steps, as in the back-fitting procedure described in Schoelkopf and Smola (2002): (i) estimate the parametric part with coefficients β ; (ii) keeping β fixed, estimate the coefficients α of the non-parametric part; (iii) joint estimation of α and, as in (3.13) and (3.14). To estimate a purely parametric index ∑ j β jφ j ( x ) we implement program (3.13) keeping
α = 0 , or, equivalently, taking the regularization factor so high that the non parametric part phases out. The program now defines a weighted Least-Absolute-Deviation (LAD) estimator of the insurance(e.g. Giloniaet al., 2006) but extended with financing constraint (3.9), and with a provision for a soft margin ( η -insensitive risk), that decomposes the error into a common term, the η -margin that avoids penalization of indistinguishable observations within a band, and the remaining idiosyncratic error. 5 Turning to the non-parametric part ∑ r α r k( xr ,x ) , we estimate α in (3.13), this time keeping β fixed. We make use of the Gaussian kernel with a window width that is half of the one that is optimal under Normally distributed samples (Haerdle, 1993). As discussed in section 3, this is done to keep the program (3.13) tractable in size, at the expense of a reduced capacity of accounting for interpendencies in the data. Recall that the level at which regularization factor λ is kept acts central lever to modulate the performance of the semi-parametric regression. Therefore, we scan over various λ -values, In fact, the computations use a fixed instead of a soft margin, set at η = 50 or 50,000 cedi per capita. The Lagrange multiplier it appeared that this level amounts to a penalization factor near unity.
5
20
starting from zero upwards to find the best value. At λ = 0 , we have over-fitting and maximal fit inside the sample, but the nonparametric part becomes “bumpy” with large positive as well as negative α -values, which tends to imply poor out-of-sample performance. At the other extreme, λ = ∞ , we return to purely parametric regression. We eventually select a λ -value as being optimal that is sufficiently high to reduce substantially the variability and the number of nonzero α s on the one hand, and not too high to loose the flexibility of the semi-parametric function (3.11) on the other. Specifically, the path for scanning is: λ( 1) = 0 ; λ( 2 ) = δ ; λ( n ) = δ 2n−2 for n = 3,4,5,6 .
5.2
Results from estimation
Table 3 shows the parameter estimates of a purely parametric index-function (putting α to zero), and Table 4 the results for the semi-parametric index using the back fitting procedure (keeping β fixed); in Table 5, after having selected the optimal λ -value, we report on results of the joint estimation of α and β . The three tables refer to the optimal indemnification contract of the national pool only (Table 2, last column). Results for contracts at the region- and the quintilelevel are qualitatively similar, with noted differences with respect to the level of self-financing (see Table 2) and a much more accentuated tendency for income distributions to collapse near the poverty line. In fact, for the poor, in these two schemes any income in excess of the poverty line tends to be largely absorbed by their comparatively high premium, implying that in the absence of solidarity at national level, subsidies from other sources would be needed to make the arrangement attractive for its participants.
21
Table 3 Estimated parametric index function ( α = 0 ) (constant: β0 = 593 , R 2 = 0.54 )
β
LGP Effect -1.054 Mean 255 Elasticity 2.1468
Cocoa -0.003 2955 0.0707
cassava -0.213 108 0.1835
Yam -0.156 338 0.4206
cocoyam -0.046 302 0.1108
maize -0.048 471 0.1803
tomatoes farmsize -0.016 -0.015 886 5280 0.1131 0.6318
It appears that the parameters of a purely parametric index all have the expected negative sign, indicating that a prolonged growing season, higher prices and an increased farm size all tend to lessen the need for indemnification. It also appears that the fit of the parametric index is fairly good (R2 of 0.54, which increases to 0.77 for the fit within the 50,000 cedi band allowed for through the hard margin). Here we report on R2 for ease of reference, noting, however, that this is not the risk criterion actually minimized and not even the least-absolute deviation criterion. As regards the magnitude of the effects, the elasticity estimates (Table 3, row 3) indicate that the indemnity payment is more responsive to weather than to price shocks. For example, a 10% variation in the LGP, say a decrease of one month in the growing period of perennials leads to an increase of about 25% in the indemnification. This corresponds to an increase of almost 50.000 cedi. With respect to prices, it appears that the index function is most responsive to the price of yam (elasticity of 0.4) followed by cassava and maize (around 0.2), cocoyam and tomatoes (0.1), while the price of cocoa affects the indemnification less. This agrees with the relatively high vulnerability of staple crop farmers in the northern parts of Ghana, as compared to the better economic situation of cassava and cocoa farmers in the southern parts. Similar results persist if we substitute the cocoa price with the highly correlated prices of other commercial crops, such as palm oil and tropical fruits. The poverty incidence under the parametric index function shows a limited capacity to address poverty. As indicated in Table 4 below, the simulated poverty incidence decreases from 44.1 to 36.9 per cent, a reduction by 7.2 per cent points only. Nonetheless, we remark that despite its modest performance, this parametric function is by virtue of its least absolute deviation, likely to perform much better than any synthetic specification based on covariance matrices. Table 4 Estimated Semi-Parametric index function, back-fitting procedure ( β fixed)
n
2
R
Poverty incidence (%)
Mean of
α
Std dev of
α
Mean of abs. errors
Regularization term
-
-
61.7
-
Uninsured case -
44.1
-
Parametric index
0.541
36.9
-
-
Semi-Parametric index λ( n ) 1
0.852
14.9
10.9
2878
8.4
0
2
0.852
17.9
11.7
1239
9.4
67
22
3
0.805
18.5
9.1
213
18.7
334
4
0.727
22.2
-2.1
68
30.6
678
5
0.667
29.1
43
38.7
1585
-2.4
Table 4 shows that important improvements of the index-based insurance can be obtained by opting for the semi-parametric specification. The ability of the non-parametric part to adapt better the index-based insurance to the needs for indemnity payments comes to the fore in the second and third column. Without regularization, at n = 1, the fit is very good (R2 of 0.852) but as expected, it gradually decreases with an increasing regularization, slowly approaching the fit of the purely parametric index function (R2 of 0.541). By the same token, as compared to the parametric index and at moderate levels of regularization, the poverty incidence is substantially lowered to around 22.2 per cent (colum 3). We also remark that the presumably overfitted index function at λ ( 1) = 0 can be given a particular interpretation. As mentioned in section 3.4, the results are indicative of the minimum level of farmers’ basis risk of any index function based on the selected weather and price variables and satisfying the self-financing constraint. In the prevailing case, this minimum is 58,000 cedi, comprising an assumed 50,000 induced by the η -margin plus an average absolute error of about 8,000 from the ideal indemnification payment (column 6), as compared to a mean absolute error of 62,000 for the parametric index. Reducing this η -margin further would definitely reduce the value further, but come up eventually against the limits of the spread in xvalues discussed in section 3. The table further shows how, under regularization, the mean and standard deviation of the parameters α decline significantly, reducing both the contribution and the bumpiness. At the same time, both the mean absolute error (first part of the objective) increases and the regularization term (second part of the objective) rise. Table 5 Estimated parametric part of regularized semi-parametric estimation (joint estimation of α , β ) (constant: β0 = 593 , R 2 = 0.73 )
β
LGP Cocoa cassava Yam cocoyam maize Effect -0.977 -0.003 -0.214 -0.154 -0.047 -0.050 Elasticity 1.99308 0.07092 0.184896 0.416416 0.113552 0.1884
tomatoes farmsize -0.017 -0.016 0.120496 0.67584
In table 5, we report on the estimated parameters of the semi-parametric estimate for α and β jointly, focusing on the best regularization factor obtained at a value ( λ( 4 )) that reduces the variability of α s (standard deviation=68) without causing a sharp decline in the fit ( R 2 = 0.73 ). The results indicate that for the regularization considered, the purely parametric index function produces parameters close to those obtained from the semi-parametric function. This is a good property that would indicate that the kernel terms of the support vectors of the non-parametric part are fairly orthogonal to the linear terms of the parametric part, and corroborates the use of the kernel term as an additional explanatory element, to model large errors better than under normality assumptions (or more precisely, under assumptions of zero median and skewness). Of course, this independence depends on the regularization and will be stronger at higher levels.
23
As a further illustration of the performance, under the various arrangements, Figure 3 compares the (kernel-smoothed) income distribution between the fully insured income (red, lowest line), the ideal indemnification (blue, with the peak), the fully parametric index (Table 3, black, second highest round top), and the semi-parametric index function (Table 5, green, highest round top).
0
.002
.004
.006
.008
.01
Figure 3: Insurance against poverty Uninsured income; ideal insurance; index-based insurance (parametric/ semi-parametric)
Poverty Line x Uninsured Param
Ideal Semi-Par
Some interesting aspects can be pointed to. First, comparing the uninsured case with the ideal insurance, it appears that the income distribution under a self-financed ideal national scheme collapses near the poverty line in line with a redistribution of income in favor of the poorest strata of the population. In this case, the income of only 13% of farm households exceeds the international threshold of 2 dollars per capita per day (1,750,000 cedi), while this percentage was twice as high (26%) before. Second, comparing the uninsured case with the two index-based ones, we also find a tendency for incomes to collapse on the poverty line but clearly much less than in the ideal case. The simulated effects on poverty are substantial, nonetheless. Poverty prevalence decreases (Table 4) but also the depth of poverty is reduced as well, as can be seen from the narrowing of the left side tails. Finally, the narrowing of the left side tail is much more pronounced for the semiparametric index, implying that it definitely out-performs the purely parametric index function and can be more effective in addressing the indemnification requirements of the most needy.
6.
Summary and conclusions
24
Our has studied to what extent indexed-based insurance, price-weather insurance in particular, can contribute to poverty reduction, looking for ways to reduce the gap between both strands of literature. A review of ongoing efforts to implement index-based insurance in developing countries revealed that participation rates are generally low and that self-financing is very modest. Leaving aside various explanations associated to institutional weaknesses such as corruption, we noted that this limited enthusiasm might also be due to the fact that the basis risk of the proposed indemnification schemes will tend to be relatively high as long as these remain based on functions that are formulated on a priori grounds, without much attention for the fit to the risk profiles of the target groups. To address this aspect, we have proposed and implemented an approach to develop an index-based contract that adapts optimally to these profiles, while focusing on avoidance of catastrophic risk, defined as a fall below the poverty line. Besides this adaptation to the risk profile, the contract ensures self-financing for a given level of subsidy and under assumed pooling of individual farmers within specified groups: quintiles, regions and the nation. Next, we have considered a set of weather and price variables and constructed an index function that is as close as possible to the ideal indemnification scheme by minimizing the basis risk that is associated to a particular function. For this purpose, we adapted a technique from SVregression while allowing for financing restrictions. We have seen that the approach inherits the flexibility and the convergence properties of SV-regression. In addition, the absolute value criterion of our formulation enabled us to optimize in a fully integrated way from the perspective of the risk minimizing farmer, hence avoiding the traditional dichotomy between formulation of and simulation with a micro-model with an income risk minimizing farmers on the one hand and econometric estimation with a likelihood risk on the other. The numerical application has been conducted in a dedicated, fully GAMS-controlled software package that can flexibly accommodate bagging as well as various extensions of the constraint set beyond self-financing. At the practical level, we ran the estimation on a sample of 100 representative Ghanaian households over 26 possible states of the world drawn from historically observed conditions. The parametric part of the SV-regression showed the expected signs, with indemnification requirements reacting with a negative sign to higher rainfall and higher prices, Specifically, among these variables, rainfall is the best predictor, followed by the price of staple crops, which might reflect the relatively high vulnerability of staple crop farmers in the Northern part of Ghana as compared to the better-off situation of cash-crop farmers in the Southern part. Our comparison of parametric with semi-parametric index functions showed that the nonparametric part improves the capacity to adapt to the required indemnification payments. We also studied how, under increased regularization, the fit of the semi-parametric index function approaches the purely parametric one, reducing the fit but also the bumpiness of the prediction The simulated effects on poverty were substantial. Poverty prevalence was reduced on average but the depth of poverty decreased as well, especially for the semi-parametric index at a moderate level of regularization. Regarding work in progress, we intend to assess the robustness of our estimates by expanding the constraint set with solvency constraints and by applying bagging through a series of sub-samples in program (3.14) implemented with a broader window size that explores more interdependencies in the data.
25
References Barnett, B.J., C.B. Barrett and J.R. Skees (2006) ‘Poverty traps and index-based risk transfer products’. Mimeo. Beaudry, P. and D.A. Green (2000) ‘Cohort patterns in Canadian earnings: assessing the role of skill premia in inequality trends’. Canadian Journal of Economics 33(4), 907-36 Chapagain, A.Y. and A.K. Hoekstra, (2004) Water Footprints of Nations. UN/WWAP, UN World Water Development Report. New York: United Nations Deaton, A.S. (1985) ‘Panel data from time series of cross-sections’. Journal of Econometrics 30,109-26. Deaton, A.S. (1997) The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Baltimore: John Hopkins University Press. Decaluve, B., L. Savard and E. Thorbecke (2001) ‘General Equilibrium approach for poverty analysis’ Mimeo Dercon, S. (1996) ‘Risk, crop choice, and savings: Evidence from Tanzania’. Economic Development and Cultural Change 44, 485-513. Dercon, S. ed. (2005) Insurance Against Poverty. Oxford: Oxford University Press. Ermoliev, Y.M., T.Y. Ermolieva, G.J. MacDonald and V.I. Norkin (2000) ‘Stochastic optimization of insurance portfolios for managing exposure to catastrophic risks’. Annals of Operations Research 99, 207-25. Fafchamps, M. (1992) ‘Cash crop production, food price volatility, and the rural market integration in the Third World’. American Journal of Agricultural Economics 74, 90-9. FAO (1979) Yield response to water. Drainage and Irrigation Paper 33. Rome: Food and Agriculture Organization of the United Nations. FAO (2005) Fertilizer use by crop in Ghana : Rome: Food and Agriculture Organization of the United Nations. Gilonia, A., J.S. Simonoff and B. Senguptac (2006) ‘Robust weighted LAD regression’. Computational Statistics and Data Analysis 50, 3124-40. GMI (2006) Monthly Rainfall in 72 Synoptic Rainfall Stations. Accra: Ghana Meteorological Institute. GSS (1987) Ghana Population Census 1984. Accra: Ghana Statistical Service. GSS (1989) Ghana Living Standards Survey: First Year Report, September 1987 - August 1988. Accra: Ghana Statistical Service. GSS (1995) Ghana Living Standards Survey: Report on the Second Round, October 1988 September 1989. Accra: Ghana Statistical Service. GSS (1995) Ghana Living Standards Survey: Report on the Third Round, September 1991 September 1992. Accra: Ghana Statistical Service. GSS (2000) Ghana Living Standards Survey: Report on the Fourth Round, April 1998 - March 1999. Accra: Ghana Statistical Service. GSS (2001) Ghana Population and Housing Census 2000. Accra: Ghana Statistical Service. GSS (2005) Quarterly Digest of Statistics. Accra: Ghana Statistical Service. Goodwin, B. K. (2001) ‘Problems with market insurance in agriculture’. American Journal of Agricultural Economics 83, 643-9. Goodwin, B. K. and O. Mahul (2004) ‘Risk modeling concepts relating to the design and rating of agricultural insurance contracts’. Policy Research Working Paper 3392 Washington D.C.: World Bank. Hess, U., and J. Syroka (2005) ‘Weather-based insurance in Southern Africa: the Malawi case study’. World Bank Agricultural and Rural Development Discussion Paper 13, Washington DC: World Bank.
26
IMF, (2005) International Financial Statistics Yearbook Washington DC: International Monetary Fund. Herbrich, R. (2002) Learning kernel classifiers. Theory and Algorithms. Cambridge Ma: MIT Press. Huber, P.J. (1981). Robust Statistics. New York: Wiley. Kalavkonda V. and O. Mahul (2005) ‘Crop insurance in Karnataka’. World Bank Policy Research Working Paper 3654. Washington D.C.: World Bank. Keyzer M.A. (2005) Rule based and support vector regression and classification algorithms for joint processing of census, map survey and district data. SOW-VU Working Paper 05-01. Centre for World Food Studies, Vrije Universiteit Amsterdam. Kimeldorf, G.S. and G. Wahba (1970) ‘A correspondence between Bayesian estimation on stochastic processes and smoothing by splines’. Annals of Mathematical Statistics 45, 495502. Koenker, R., and Bassett, G. W. (1978) ‘Regression Quantiles’, Econometrica, 46: 33–50. Mahul O. and B.D. Wright (2003) ‘Designing optimal crop revenue insurance’. American Journal of Agricultural Economics 85(3), 580-9. Norkin and Keyzer (2007) ‘On convergence and stability of kernel learning quantile estimators’, Glushkov Institute of Cybernetics, Kiev, and Centre for World Food Studies, Amsterdam. Mimeo. Rosenzweig, M.R. (1988) ‘Risk, implicit contracts and the family in rural areas of low income countries’. Economic Journal 98, 1148-70. Schoelkopf, B. and A.J. Smola (2002) Learning with Kernels. Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge Ma: MIT Press. Skees, J.R, S. Gober, P. Varangis, R. Lester and V. Kalavakonda (2001) ‘Developing rainfall-based index insurance in Morocco’. Policy Research Working Paper 2577. Washington D.C.: World Bank. Skees, J.R., P. Hazell and M. Miranda (1999) ‘New approaches to crop yield insurance in developing countries’. EPDT Discussion Paper 55. Washington D.C.:International Food Policy Research Institute. Skees, J., P. Varangis, D.F. Larson and P. Siegel (2005) ‘Can financial markets be tapped to help poor people cope with weather risks?’ In: S. Dercon ed. Insurance Against Poverty. Takeuchi I., Q.V. Le, T. Sears, and A.J. Smola (2005) ‘Nonparametric Quantile Estimation’, Journal of Machine Learning Research, 7: 1001-1033 Teal F., M. Vigneri (2004) ‘Production changes in Ghana cocoa farming households under market reforms’ Mimeo Vapnik, V.N. (1998) Statistical Learning Theory. New York: Wiley. Veeramani, V.N., L.J. Maynard and J.R. Skees (2005) ‘Assessment of the risk management potential of a rainfall based insurance index and rainfall options in Andhra Pradesh, India’. Indian Journal of Economics & Business 4(1), 195-208. Zant W. (2005) ‘Is index insurance useful for cash crop growers? Simulating production and price index insurance for Indian smallholder pepper growers’. Mimeo