Gianni BETTI - World Bank Group

FURTHER UPDATING POVERTY MAPPING IN ALBANIA

GIANNI BETTI*, ANDREW DABALEN**, CELINE FERRÈ** AND LAURA NERI*
* University of Siena, Italy, ** The World Bank, Washington, USA
Corresponding Author: Gianni Betti, [email protected]

Paper prepared for presentation at the World Bank International Conference on Poverty and Social Inclusion in the Western Balkans WBalkans 2010 Brussels, Belgium, December 14-15, 2010

Copyright 2010 by author(s). All rights reserved. Readers may make verbatim copies of this document for non-commercial purposes by any means, provided that this copyright notice appears on all such copies.


Abstract
This paper further updates the poverty mapping of Albania, following the first work of Betti et al. (2003), who calculated poverty and inequality maps for the year 2002, and the first update by Dabalen and Ferrè (2008), who updated the poverty mapping to the year 2005. The aim is to create reliable and relevant statistical information to assist policy-makers in the design, implementation and evaluation of economic, social and environmental programs in Albania, by further updating the poverty mapping to the year 2008.
Key words: Poverty mapping, counterfactual distribution, LSMS survey, census.


1. Introduction

The World Bank is assisting the Government of Albania in the establishment of a permanent poverty monitoring and policy evaluation system. This paper further updates the poverty mapping of Albania, following the first work of Betti et al. (2003), who calculated poverty and inequality maps for the year 2002, and the first update by Dabalen and Ferrè (2008), who updated the poverty mapping to the year 2005. The aim is to create reliable and relevant statistical information to assist policy-makers in the design, implementation and evaluation of economic, social and environmental programs in Albania, by further updating the poverty mapping to the year 2008.

The first poverty mapping exercise, conducted by Betti et al. (2003), was based on the methodology fully described in Elbers, Lanjouw and Lanjouw (2003). This methodology combines census and survey information to produce finely disaggregated maps which describe the spatial distribution of poverty and inequality in the country. Producing poverty and inequality maps requires large data sets which include reasonable measures of income or consumption expenditure and which are representative, or of sufficient size, at low levels of aggregation to yield statistically reliable estimates. Household budget surveys or Living Standard Measurement surveys covering income and consumption, which are usually used to calculate distributional measures, are rarely of such a size, whereas censuses or other surveys large enough to allow disaggregation have little or no information on monetary variables.
The basic idea is to estimate a linear regression model with local variance components using the information from the smaller and richer data sample, in this case the Albanian Living Standard Measurement Study (LSMS) conducted in 2002, including some aggregate information from the Population and Housing Census or from other sources available for all the statistical units in the sample (i.e. from the General Census of Agricultural Holdings). The vector of covariates used in the regression model was restricted to those variables that can also be linked to households in the census. The estimated distribution of the dependent variable in the regression model (the monetary variable) can therefore be used to generate the distribution for any sub-population in the census, conditional on the sub-population's observed characteristics. From the estimated distribution of the monetary variable in the census data set, or in any of its sub-populations, a set of poverty measures was estimated: the Foster-Greer-Thorbecke indexes (for α=0,1,2) and the Sen index, based on an absolute poverty line calculated using the information contained in the rich sample survey. A set of inequality measures was also estimated: the Gini coefficient, the Gini coefficient of the poor and two general entropy (GE) measures, with parameter c=0,1. Moreover, bootstrap standard errors of the welfare estimates were computed so as to assess the precision of the estimates.

Later, Dabalen and Ferrè (2008) updated the first poverty mapping by means of the new round of the Living Standard Measurement Survey, the LSMS 2005. The contribution of that work relied on a reweighting scheme. Specifically, the proposed method, which took inspiration from Lemieux (2002), constructs a counterfactual consumption distribution for the old household survey, in our case the Albania LSMS 2002, using the information contained in the latest survey, the Albania LSMS 2005. The method projects what the consumption distribution for the 2002 LSMS (which was conducted at about the time of the census) would look like if the parameters (the coefficients of a consumption model) and the distribution of the characteristics of the sample, and therefore of the population, were as reported in the 2005 survey. The derived counterfactual distribution, together with the census, was then used to obtain an updated poverty map of the country, using the methodology proposed by Elbers et al. (2003) and used in the original poverty and inequality mapping by Betti et al. (2003). In this report we aim at further updating the poverty mapping in Albania by using the procedure proposed by Dabalen and Ferrè (2008).
We construct a counterfactual distribution of the monetary variable for the LSMS 2002, using information from the latest survey, the Albania LSMS 2008. We then apply the original proposal of Elbers et al. (2003) on the basis of the counterfactual distribution, which leads to a monetary distribution in the census data referring to the year 2008.

This report is made up of six sections and two annexes. After this introduction, section two is devoted to the comparison and harmonisation of the data sources, giving special attention to the Census and LSMS data sets. Section three describes the methodology proposed by Dabalen and Ferrè (2008) for constructing the counterfactual distribution in the LSMS 2002 survey. Section four reports the estimated linear regression models with variance components and fully describes how the Monte Carlo simulation was set up to prepare the statistical information for calculating bootstrap standard errors of the poverty measures. Section five reports the indices described above, calculated for the whole of Albania and disaggregated at five levels:

a) the four strata used in sampling the LSMS;
b) the six strata for which we have estimated the linear regression models;
c) the 12 Prefectures;
d) the 36 Districts;
e) the 374 Communes/Municipalities.

Section six reports conclusions and recommendations. Annex one reports poverty measures at Commune and Municipality level, while Annex two fully describes the comparison made between the various data sources and the list of common variables.

2. The sources

The Republic of Albania is geographically divided into 12 Prefectures. These are divided into Districts which, in turn, are divided into Municipalities and Communes. The Communes contain all the rural villages and the very small cities. The capital of Albania, Tirana, is also divided into 11 Mini-municipalities. The two main sources of statistical information available in Albania are:

the Population and Housing Census (PHC), 2001;
the Living Standard Measurement Study (LSMS), 2002, 2005 and 2008.

2.1 The Population and Housing Census¹

The census was conducted in April 2001, with midnight of 31 March 2001 as the reference moment. The 2001 census introduced some essentially new concepts in data collection methods as well as in definitions; in particular, the concept of an open population was introduced in order to assess the consequences of emigration and internal migration. For the purposes of the April 2001 General Census of Population and Housing, the cities and the villages were divided into 9,834 Enumeration Areas (EAs), which were established throughout the country and generally comprised about 80-120 dwellings. The fieldwork of the census was based on a four-part questionnaire with questions at four different levels:

a) Building questionnaire: to be completed only for the first or only dwelling in the building.
b) Dwelling questionnaire: to be completed for all the inhabited dwellings in the building.
c) Household questionnaire: to be completed for all the households in the dwelling.
d) Individual questionnaire: to be completed for all the members of the household who are present, or absent for less than 1 year (to be defined in the roster).

At the end of March 2001 in Albania there were 726,895 households with 3,069,275 persons (1,347,281 in the labour force) living in 512,387 buildings.

¹ INSTAT (2002), The Population of Albania in 2001.

2.2 The Living Standard Measurement Study (LSMS) – 2002-2005-2008²

The 2002 LSMS was carried out between April and June, with some field activities extending into August and September. The survey work was undertaken by the Living Standards unit of INSTAT (Albanian National Statistics Office), with the technical assistance of the World Bank. The Population and Housing Census (PHC), performed in mid-2001, provided the country with a much needed updated sampling frame, which is one of the building blocks of the household survey structure. In fact, the 9,834 Enumeration Areas formed the basis for the LSMS sampling frame. The final sample design for the 2002 LSMS included 450 PSUs and 8 households in each PSU, for a total of 3,600 households. Four reserve units were selected in each sample PSU to act as replacement units in cases of non-response. In the few cases in which the rate of migration was particularly high and more than four of the originally selected households could not be found for the interview, additional households from the same PSU were randomly selected.

The sampling frame was divided into four regions (strata): Coastal Area, Central Area, Mountain Area, and Tirana (urban and other urban). These four strata represent the domains of estimation. They were further divided into major cities, other urban, and other rural (Table 1). The EAs were allocated proportionately to the number of housing units in these areas. The same sampling scheme was later adopted for the LSMS 2005 and the LSMS 2008.

² The World Bank (2002), Basic Information Document, Living Standard Measurement Study, Albania, Development Research Group. INSTAT (2009), Albania: Trends in Poverty 2002-2005-2008.

Table 1: Domains of Estimation (Regions): Districts and Major Cities in the Domains of Estimation

Region 1, Coastal Area
  Districts (Other Urban): Lezhë, Kurbin, Kavajë, Mallakaster, Lushnje, Delvine, Sarande
  Major Cities: Durres, Fier, Vlore

Region 2, Central Area
  Districts (Other Urban): Devoll, Kolonjë, Pogradec, Mirdite, Puke, Malesi e Madhe, Mat, Kuçove, Skrapar, Krujë, Peqin, Gjirokastër, Permet, Tepelenë, Tirana (rural)
  Major Cities: Shkoder, Elbasan, Berat, Korçë

Region 3, Mountain Area
  Districts (Other Urban): Kukes, Has, Tropoje, Bulqize, Diber, Gramsh, Librazhd

Tirana
  Tirana urban, Tirana other urban

Four survey instruments were used to collect information for the 2002, 2005 and 2008 Albania LSMS surveys:

a) Household questionnaire;
b) Diary for recording household food consumption;
c) Community questionnaire;
d) Price questionnaire.

2.3 Stage Zero: are the Census and the LSMS 2002 comparable?

The two sources of data (Census and LSMS 2002) were fully analysed by Betti et al. (2003) in order to identify the common concepts and to construct the common variables to be compared. The original Census and LSMS 2002 variables were transformed in order to obtain comparable variables. The set of common variables was divided into three categories:

a) household dwelling conditions and presence of durable goods (23 variables);
b) household head characteristics (8 variables);
c) household socio-demographic characteristics (7 variables).

Since some variables collected in the LSMS 2002 survey presented missing values, it was decided to impute them in order to avoid the loss of statistical units (and therefore degrees of freedom) in the estimation of the linear regression model with variance components. The imputation procedure was based on the “sequential regression multivariate imputation” (SRMI) approach adopted by the imputation software IVEware, and is fully described in Betti et al. (2003). The variables which underwent the imputation procedure were: i) type of building; ii) inhabited dwelling surface.

Each of the 38 variable distributions from the Census was compared with the corresponding distribution from the LSMS 2002, with the weighted distribution from the LSMS 2002 and, for the imputed variables listed above, with the imputed LSMS 2002 distribution. A chi-square test was used for the comparisons. The main decision to be taken is the choice of the potential variables to be included in the regression model as explanatory variables. According to the chi-square test, only 9 out of 38 LSMS 2002 distributions fit their census counterpart very well. This leads to a trade-off between the use of many explanatory variables (not highly comparable with those in the census) and the use of few explanatory variables (losing part of the explanation of the variability of the dependent variable in the model). To overcome this problem it was decided to reduce the number of categories of most of the variables in order to obtain new distributions (mainly dummies) which were as similar as possible to those in the census.
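The comparison described above, and the effect of collapsing categories into a dummy, can be illustrated with a minimal sketch. The counts and shares below are invented for illustration and are not taken from the Census or the LSMS; the function name and the three-category variable are hypothetical, and the sketch ignores the survey design, which a real comparison would have to take into account.

```python
def chi_square_statistic(survey_counts, census_shares):
    """Pearson chi-square statistic of survey counts against census shares."""
    n = sum(survey_counts)
    stat = 0.0
    for observed, share in zip(survey_counts, census_shares):
        expected = n * share
        stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical three-category variable (all numbers are made up).
survey_counts = [200, 160, 40]     # survey counts per category
census_shares = [0.5, 0.3, 0.2]    # census proportions per category

stat = chi_square_statistic(survey_counts, census_shares)
CRITICAL_5PCT_DF2 = 5.991          # chi-square critical value, 2 d.f., 5% level
fits_census = stat < CRITICAL_5PCT_DF2

# Collapsing categories 2 and 3 turns the variable into a dummy.
dummy_counts = [survey_counts[0], survey_counts[1] + survey_counts[2]]
dummy_shares = [census_shares[0], census_shares[1] + census_shares[2]]
dummy_stat = chi_square_statistic(dummy_counts, dummy_shares)
CRITICAL_5PCT_DF1 = 3.841          # chi-square critical value, 1 d.f., 5% level
dummy_fits_census = dummy_stat < CRITICAL_5PCT_DF1

print(round(stat, 2), fits_census, round(dummy_stat, 2), dummy_fits_census)
```

In this toy case the three-category distribution is rejected while the collapsed dummy matches the census, which is the kind of gain the reduction of categories is meant to achieve.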

3. The estimation of the counterfactual distribution

The proposal of Dabalen and Ferrè (2008) for updating the poverty mapping starts from the hypothesis that there is only one census and two household surveys, a new and an old one. In this report the household surveys are the LSMS 2008 and the LSMS 2002. When the old household survey is conducted at about the same time as the census, the Elbers et al. (2003) method can be applied and welfare estimates obtained for that year. The problem is how to update the poverty estimates for small areas when there is a new survey but no new census, especially when the new survey may contain either very few observations or none at all in most of the small areas. In this case, Dabalen and Ferrè (2008) propose to construct a counterfactual consumption distribution for the old household survey, using information from both the old and the new household survey, and to match the corresponding estimates with the old census data, following the methodology proposed by Lemieux (2002). Here, using the notation of Dabalen and Ferrè (2008), we show how to construct the counterfactual distribution for the LSMS 2002 data, which will permit us to link the Census data with the new LSMS 2008 survey.

Let us consider a consumption model using the new LSMS 2008 survey:

$$y_i^{08} = X_i^{08}\beta^{08} + u_i^{08} \qquad [1]$$

where $y_i^{08}$ denotes consumption in year 2008, $i$ indexes the household, $\beta^{08}$ is a parameter (that captures the returns to or price of covariates in 2008), $X_i^{08}$ is a vector of covariates, which are in common between the LSMS 2002 and the LSMS 2008 surveys, and $u_i^{08}$ is the unobserved component of consumption.

Dabalen and Ferrè (2008) note that using the LSMS 2008 survey without additional adjustment and applying the Elbers et al. (2003) estimator would be problematic, because the returns to covariates, the parameter $\beta^{08}$, may have changed between 2002 and 2008. In addition, the profile of the population, that is covariates such as education levels, age composition and so on, may also have changed. Finally, the returns to unobserved covariates may also have changed. To recreate a consumption distribution that resembles consumption of 2002, we would have to account for these changes. Therefore, the construction of the counterfactual consumption distribution proposed by Dabalen and Ferrè (2008) is based on three basic steps (which should not be confused with Stage zero, Stage one and Stage two of the Elbers et al. (2003) methodology).

The first step of Dabalen and Ferrè (2008) consists in creating the consumption distribution that would have prevailed in 2002 if the parameters were as in 2008. This can be written as:

$$\hat{y}_i^{c} = X_i^{02}\hat{\beta}^{08} \qquad [2]$$

Equation [2] accounts for changes in the parameters of the covariates, by using the parameters estimated from the LSMS 2008 survey to estimate the consumption distribution in the LSMS 2002 survey. However, in addition to these parameters, the levels of the covariates may have changed because, as explained by Dabalen and Ferrè (2008), the population in 2008 is much more educated than in 2002. When the number of covariates of interest is small, say only an education variable that takes only two values (primary and higher education), a simple re-weighting of each cell would be sufficient. But when changes in multiple covariates are of interest, as they are in our case, cell-by-cell reweighting is not feasible. Instead, Dabalen and Ferrè (2008) propose to create a score that reduces the dimension of the data, by stacking the LSMS 2008 and LSMS 2002 surveys and then running a probit model of the form:

$$P_i = \Pr(t_i = 2008 \mid Z_i, M_i) = \Phi(Z_i\gamma + M_i\delta) \qquad [3]$$

where $\Phi$ is the standard normal distribution function and $\gamma$ and $\delta$ are the probit coefficients. In principle, a large set of observable household level characteristics, $Z_i$, can be included, and also the migration status of the household, $M_i$, or any suitable variables that capture the scale of migration, which is of crucial concern when trying to update poverty maps. Equation [3] allows us to obtain a propensity score, the predicted probability $P_i$ of being in period $t = 2008$ conditional on the observable characteristics, and from it the reweighting factor:

$$\psi_i = \frac{P_i}{1 - P_i}\cdot\frac{1 - P}{P} \qquad [4]$$

where $P$ is the unconditional probability that an observation belongs to period $t = 2008$, or the share of year 2008 observations in total observations (that is, both years). In this framework, accounting for changes in the distribution of observable characteristics is equivalent to reweighting the consumption distribution estimated in equation [2], as proposed by Dabalen and Ferrè (2008), which gives:

$$\hat{y}_i^{c} = X_i^{02}\hat{\beta}^{08}, \quad \text{with household } i \text{ weighted by } \psi_i \qquad [5]$$

The only step remaining in the proposal of Dabalen and Ferrè (2008) is to add a measure of the unobserved component of consumption. If the dispersion in unobserved consumption were due to random events that are unrelated to systematic differences across households, there would be nothing more to say about the error term. However, the residual is unlikely to be just a random component of consumption. Instead, it may reflect systematic, albeit unexplained, differences between households. For this reason, Dabalen and Ferrè (2008) adjust the consumption in equation [5] with counterfactual residuals. They first estimate a consumption model for the 2002 data, and rank all the households on the basis of the residual distribution for that year. Then they assign to each household in year 2002 the value of the ranked residual from the empirical distribution of residuals in the new survey (year 2005 in their work, year 2008 in this further update) that corresponds to the year 2002 rank. In this way, we get the counterfactual consumption: the consumption that would have been observed in 2002 if the parameters, the distribution of covariates and the unmeasured determinants of consumption were as in 2008. From equations [1] and [5], the counterfactual consumption distribution can be rewritten as:

$$y_i^{c} = X_i^{02}\hat{\beta}^{08} + \tilde{u}_i^{08}, \quad \text{with household } i \text{ weighted by } \psi_i \qquad [6]$$

where $\tilde{u}_i^{08}$ denotes the value of the ranked residual in 2008 assigned to a household with the same residual rank in year 2002. Dabalen and Ferrè (2008) note that in practice, in addition to the reweighting factor, sampling weights, $\omega_i$, can be easily introduced so that we end up with a modified weighting factor, $\psi_i\omega_i$.
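The three steps of this section (new-survey coefficients applied to old-survey covariates, propensity-score reweighting, rank-matched residuals) can be sketched as follows. This is only an illustration under stated assumptions: the function names and the toy numbers are hypothetical, the coefficients and propensity scores are taken as given (in practice they would come from a regression on the new survey and from a probit on the stacked surveys), and the residual matching uses a simple mid-rank rule that may differ from the rule actually applied by the authors.

```python
def predict(x_row, beta):
    """Linear prediction x_i * beta for one household."""
    return sum(x * b for x, b in zip(x_row, beta))


def reweighting_factor(p_i, share_new):
    """Odds of belonging to the new survey given the covariates,
    rescaled by the unconditional odds of the two samples."""
    return (p_i / (1.0 - p_i)) * ((1.0 - share_new) / share_new)


def rank_matched_residuals(resid_old, resid_new):
    """Give each old-survey household the new-survey residual that sits
    at the same relative rank in the new residual distribution."""
    n_old, n_new = len(resid_old), len(resid_new)
    sorted_new = sorted(resid_new)
    order = sorted(range(n_old), key=lambda i: resid_old[i])
    matched = [0.0] * n_old
    for rank, i in enumerate(order):
        quantile = (rank + 0.5) / n_old              # mid-rank position in (0, 1)
        j = min(int(quantile * n_new), n_new - 1)    # same position in new sample
        matched[i] = sorted_new[j]
    return matched


def counterfactual(x_old, beta_new, resid_old, resid_new, p_scores, share_new):
    """Counterfactual values and weights for the old-survey households."""
    matched = rank_matched_residuals(resid_old, resid_new)
    values = [predict(x, beta_new) + u for x, u in zip(x_old, matched)]
    weights = [reweighting_factor(p, share_new) for p in p_scores]
    return values, weights


# Toy data: four old-survey households, intercept plus one dummy covariate.
x_old = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
beta_new = [9.0, 0.5]                  # assumed new-survey coefficients
resid_old = [0.3, -0.1, 0.0, 0.2]      # residuals of the old-survey model
resid_new = [1.0, -2.0, 0.5, -0.5]     # residuals of the new-survey model
p_scores = [0.5, 0.75, 0.5, 0.25]      # assumed propensity scores

values, weights = counterfactual(
    x_old, beta_new, resid_old, resid_new, p_scores, share_new=0.5
)
print(values)
print(weights)
```

Sampling weights, where available, would simply multiply the returned weights household by household.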

4. Stage One: the estimation of stratum-specific linear regression models with variance components for imputing expenditures

The basic idea of Stage one of Elbers et al. (2003) can be explained in a simple way. Given data from a smaller and richer sample, such as a sample survey (here the counterfactual distribution in the LSMS 2002), and from a census, a regression model of the target household-level variable on a set of covariates can be estimated on the smaller sample. Restricting the set of covariates to those that can also be linked to households in the larger sample, the estimated distribution can be used to generate the distribution of the consumption expenditure ($y_h$) for the population, or any sub-population, in the larger sample given the observed characteristics. The conditional distribution of a set of welfare measures can therefore be generated, and the relative point estimates and standard errors calculated. In practice the methodology follows two steps:

a) the survey data are used to estimate a prediction model for consumption;
b) the expenditure is simulated for each household of the census, and poverty/inequality measures are derived with their relative prediction error.

In the context of this work the smaller sample is the LSMS survey (2008, counterfactual 2002) and the larger one is the Census (2001). The key assumption is that the model estimated from the survey data applies to the census observations. This assumption is most reasonable if the survey and the census refer to the same year; unfortunately this is not our case, so when interpreting the results we need to bear in mind that the poverty estimates obtained refer to the census year.


4.1 A prediction model for consumption

This step (Stage one) consists in developing an accurate empirical model of a logarithmic transformation of the counterfactual household per-capita total consumption expenditure (LSMS 2002 variable logcon_cf), where rent and health expenditure are excluded. Geographical differences in the level of prices are taken into account. In the model the covariates are variables defined in exactly the same way in the smaller sample data (LSMS 2002) and in the Census, as explained in the section on Stage Zero. Denoting by $\ln y_{ch}$ the logarithm of the counterfactual consumption expenditure of household h in cluster c, a linear approximation to the conditional distribution of $\ln y_{ch}$ is considered:

$$\ln y_{ch} = E[\ln y_{ch} \mid x_{ch}] + u_{ch} = x_{ch}'\beta + u_{ch} \qquad [7]$$

Previous experience with survey analysis³ suggests that the proper model to be specified has a complex error structure, in order to allow for within-cluster correlation in the disturbances as well as heteroscedasticity. To allow for within-cluster correlation in the disturbances, the error component is specified as follows:

$$u_{ch} = \eta_c + \varepsilon_{ch} \qquad [8]$$

where $\eta_c$ and $\varepsilon_{ch}$ are independent of each other and not correlated with the matrix of explanatory variables. Since residual location effects can greatly reduce the precision of welfare measure estimates, it is important to introduce into the set of covariates some explanatory variables which explain the variation in consumption expenditure due to location. For this reason it is proposed to introduce the means of each covariate into the model. These are calculated over all the census households in the 450 census enumeration areas (EAs) which correspond to the 450 PSUs selected in the LSMS 2002 sampling scheme. The enumeration areas correspond to clusters in the LSMS 2002 data.

As in the first poverty mapping of Betti et al. (2003) and in the first update of Dabalen and Ferrè (2008), some preliminary analyses of the counterfactual distribution suggest that expenditure behaviour is locally different; so, in order to avoid forcing the parameter estimates to be the same for the whole country, it was decided to estimate separate regression models for the following areas:

o coastal area – rural (stratum 1-rural),
o coastal area – urban (stratum 1-urban),
o central area – rural (stratum 2-rural),
o central area – urban (stratum 2-urban),
o mountain area (stratum 3),
o Tirana (stratum 4).

³ Elbers, Lanjouw and Lanjouw (2003).

The final results of this first stage are the GLS estimates of the selected model estimated on the LSMS 2002 counterfactual distribution. The initial estimate of β in equation [7] is obtained from OLS (weighted with survey sampling weights); the proportion of variance explained by the model (see R-squared in Table 2) is 0.52 for the coastal area – urban (stratum 1-urban) and ranges between 0.68 and 0.98 for the other strata. With a consistent estimate of β, the residuals from the regression are used as estimates of the overall disturbances $\hat{u}_{ch}$. This residual is decomposed into uncorrelated household and location components as follows:

$$\hat{u}_{ch} = \hat{\eta}_c + e_{ch}$$

The estimated location components ($\hat{\eta}_c$) are the within-cluster means of the overall residual. The household component estimates ($e_{ch}$) are the overall residual net of the location components; these values are used to estimate the variance of $\varepsilon_{ch}$. The variance of $\eta_c$ ($\sigma_\eta^2$) is estimated non-parametrically, while the variance of $\varepsilon_{ch}$ is estimated allowing for heteroscedasticity (see Appendix 2 of Elbers, Lanjouw and Lanjouw, 2002). The two variance components are combined in order to calculate the estimated variance-covariance matrix ($\hat{\Sigma}$) of the overall residual of the original model. Once $\hat{\Sigma}$ is calculated, the original model can be estimated by GLS; the results are in Table 2.
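The residual decomposition described above can be sketched in a few lines. The residuals and cluster labels are invented, and the function names are hypothetical. The share computed at the end is only a naive moment-based ratio: it is not the non-parametric estimator of Elbers et al., and because the variance of raw cluster means also contains household-level noise it tends to overstate the location share.

```python
from collections import defaultdict


def decompose_residuals(residuals, clusters):
    """Split overall residuals into cluster means (location component)
    and household deviations from their cluster mean."""
    by_cluster = defaultdict(list)
    for u, c in zip(residuals, clusters):
        by_cluster[c].append(u)
    eta = {c: sum(v) / len(v) for c, v in by_cluster.items()}
    e = [u - eta[c] for u, c in zip(residuals, clusters)]
    return eta, e


def variance(values):
    """Population variance (divides by the number of values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Toy first-stage residuals for six households in three enumeration areas.
residuals = [0.30, 0.10, -0.20, -0.40, 0.05, 0.15]
clusters = ["EA1", "EA1", "EA2", "EA2", "EA3", "EA3"]

eta, e = decompose_residuals(residuals, clusters)

# Naive share of residual variability attributable to location.
naive_share = variance(list(eta.values())) / variance(residuals)
print(eta, [round(v, 2) for v in e], round(naive_share, 3))
```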


The estimated share of the location component with respect to the total residual variability is represented by Rho = $\hat{\sigma}_\eta^2 / \hat{\sigma}_u^2$ (see Table 2), and it is actually negligible for all the strata, ranging between 0.018822 and 0.06756. The idea of estimating different models for each stratum or sub-stratum seems to be appropriate both in terms of local effects and in terms of covariates; in fact, different subsets of covariates are significant in each model. The only parameter that is significant in every stratum/sub-stratum is the possession of a car. The other parameters significant in almost all the strata/sub-strata are the household size (as logarithmic transformation), the level of education and the number of non-working people in the household. With regard to the possession of durable goods, the most important is a refrigerator, followed by a TV and a heater. Considering the EA mean variables, it can be observed that the variables relating to migration before 1990 and to having a separate kitchen are significant in three of the six strata.

The results of this first step consist of a set of estimated GLS parameters for the regression coefficients $\hat{\beta}$, the associated variance-covariance matrix and the disturbances at cluster and household level. As for the disturbances, attention is focused on their distribution: first of all, some tests of normality have been carried out (Shapiro-Wilk, Kolmogorov-Smirnov and Cramér-von Mises). We conclude that the hypothesis of normality is almost always rejected: only for the household residuals of the Central Urban Area do we fail to reject the null hypothesis.


Table 2: Regression results by Strata: GLS estimates and standard errors (in parentheses)

                     Coastal area  Coastal area  Central area  Central area  Mountain   Tirana
                     (rural)       (urban)       (rural)       (urban)       area
Observations         520           480           520           479           1000       600
Number of groups     65            60            65            60            125        75
R-squared overall    0.76          0.52          0.81          0.68          0.96       0.98
Sigma2 ηc            0.01828       0.01842       0.01015       0.01536       0.004195   0.000315
Sigma2 εci           10.3810       37.8646       10.4065       17.3149       0.7991     0.8560
Rho                  0.040273      0.02158       0.030285      0.028923      0.06756    0.018822
-2 log likelihood    14.7          618.3         -268.0        223.1         -1241.3    -1248.3
AIC                  18.7          622.3         -264.0        227.1         -1237.3    -1244.3
AICC                 18.8          622.4         -264.0        227.1         -1237.3    -1244.3
BIC                  23.1          626.5         -259.0        231.3         -1231.7    -1239.6

# dwellings in the building 2-15 # dwellings in the building 16 or more House construction before 1945 House construction 1945 - 1960 House construction 1981 - 1990 House construction after 1990 House owner

0.08749*** (0.007916) 0.08111 (0.05358) -0.1427*** (0.01416) 0.07338 (0.06429) -0.4761*** (0.05296)

-0.00866 (0.02045) -0.0947*** (0.02036)

0.01798** (0.008414) -0.1034*** (0.03904) 0.01038 (0.01567)

House inhabited surface less than 40m2 House inhabited surface more than 70m2 Plastered

-0.2055*** (0.06166) 0.1351*** (0.02883)

Material: Brick or stone

0.03392*** (0.007109) 0.08656*** (0.007782) 0.08382*** (0.01149)

Lift

0.1108*** (0.01674)

Separate kitchen

0.1139*** (0.02604) 0.1457*** (0.01944)

Wc inside Water inside

0.1465*** (0.05000)

0.05909*** (0.01020)

0.1820*** (0.05834)

TV Parabolic

-0.03333** (0.01342) 0.1623*** (0.009566) 0.1429*** (0.009335)

0.006377 (0.04755)

Refrigerator Heater

0.04995** (0.02344)

0.06916*** (0.01798)

0.1221*** (0.03043)

Air conditioning Computer Car

-0.7385*** (0.04254) 0.2093*** (0.03183)

0.2392*** (0.05696)

0.3013*** (0.03152)

Washing machine Rooms per person

600

0.05616* (0.02968)

0.2183*** (0.05041)

0.1618*** (0.01634)

0.05715** (0.02278) -0.02181 (0.01703) 0.1118*** (0.01256) -0.7278*** (0.01245) 0.2699*** (0.01097)

0.1695*** (0.04002) 0.006790 (0.01116) 0.1281*** (0.01338)


Possession of agricultural land

-0.03413 (0.04976)

To be continued… Child 0-5 Child 6-15 Household size Household size squared

-0.02281 (0.01808) -0.03598*** (0.01386) -0.2309*** (0.02914) 0.01319*** (0.002621)

-0.05608** (0.02353) -0.01255 (0.02650) -0.1348*** (0.01833)

-0.0395*** (0.009052) -0.0993*** (0.005548)

0.2546*** (0.04254)

0.1084*** (0.01967) 0.5628*** (0.02347)

-0.2254*** (0.02879) 0.005991* (0.003151)

Highest education low Highest education medium Highest education high Migration before 1990

0.1063*** (0.02204) 0.8180*** (0.03209)

0.5407*** (0.02989)

-0.04038*** (0.004906) -0.0456*** (0.004182) -0.1189*** (0.008183) 0.01173*** (0.000608) 0.02465** (0.009703) 0.1968*** (0.009366) 0.2302*** (0.04713) 1.3024*** (0.01235)

-0.3671*** (0.007411) 0.02219*** (0.000819)

0.06702*** (0.007145) 0.9011*** (0.007254)

Migration since 1990 # non working people Household Head age Household Head female Household Head working

0.003167*** (0.000907) 0.05376* (0.03110) 0.08752*** (0.02387)

Spouse work

-0.03143* (0.01604) 0.03393*** (0.008177)

0.1006*** (0.01929) 0.07549* (0.04513)

-0.003*** (0.000255) -0.0928*** (0.009072)

0.1109*** (0.03039)

Spouse age

0.00067*** (0.000252)

Enumeration Area means variable House construction before 1945 House construction 1961 - 1980 House construction 1981 - 1990 House construction after 1990 House inhabited surface less than 40m2 Material: Brick or stone

0.5284*** (0.1124)

-0.0440*** (0.01194) 0.1872*** (0.03479) -0.2028 (0.1430)

Separate kitchen

0.2104*** (0.01998) -0.4318*** (0.03211) 0.2928*** (0.03670)

Wc inside Water inside Rooms per person Rooms business

0.3006*** (0.08432) -0.3923*** (0.1413)

Household size

0.1247*** (0.03867)

Highest education high Migration before 1990 Constant

-0.4917*** (0.1762)

0.6260*** (0.1071)

0.5337*** (0.03695)

9.7907*** (0.1605)

8.4377*** (0.03658)

9.8162*** (0.02828)

-0.2963 (0.1870) 9.3687*** (1.005)

9.4589*** (0.1149)

8.2843*** (0.2163)


4.2 Simulation on expenditure, poverty/inequality indicators and relative standard error

The parameter estimates obtained from the previous step are applied to the census data so as to simulate the expenditure for each household in the census. A set of 100 simulations has been conducted. For each simulation, a set of first-stage parameters has been drawn from the corresponding distributions estimated at the first stage: the beta coefficients, $\tilde{\beta}$, are drawn from a multivariate normal distribution with mean $\hat{\beta}$ (the coefficients of the GLS estimation) and variance-covariance matrix equal to the one associated with $\hat{\beta}$. As regards the simulation of the residual terms $\eta_c$ and $\varepsilon_{ch}$, any specific distributional assumption has been avoided by drawing directly from the estimated residuals: for each cluster the residual drawn is $\tilde{\eta}_c$, and for each household it is $\tilde{\varepsilon}_{ch}$.

The simulated values $\hat{y}_{ch}$ are based on both the predicted logarithm of expenditure, $x'_{ch}\tilde{\beta}$, and on the disturbance terms $\tilde{\eta}_c$ and $\tilde{\varepsilon}_{ch}$, using bootstrapped methods:

$$\hat{y}_{ch} = \exp\left(x'_{ch}\tilde{\beta} + \tilde{\eta}_c + \tilde{\varepsilon}_{ch}\right) \qquad [9]$$

The full set of simulated $\hat{y}_{ch}$ is used to calculate the expected value of each of the poverty measures considered. For each of the simulated consumption expenditure distributions, a set of poverty and inequality measures has been calculated, together with their mean and standard deviation over all 100 simulations.

5. Poverty measures

The procedure for estimating the poverty measures for the year 2008 has been applied for the whole of Albania and disaggregated at six levels:

a) Rural – urban level;
b) The four strata used in sampling the LSMS;
c) The six strata for which the linear regression models have been estimated;
d) The 12 Prefectures;
e) The 36 Districts;
f) The 374 Communes/Municipalities.
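The simulation step of Section 4.2 can be illustrated with a minimal sketch. It is not the authors' implementation: all inputs below (design matrix, coefficient estimates, residual pools, poverty line) are synthetic and hypothetical, and for simplicity the indices are computed over households without household-size weights. The function `simulate_poverty` and its arguments are names introduced here for illustration only.

```python
import numpy as np


def fgt(y, z, alpha):
    """FGT index of order alpha for expenditures y and poverty line z."""
    poor = y < z
    gap = np.where(poor, (z - y) / z, 0.0)
    return float(np.mean(np.where(poor, gap ** alpha, 0.0)))


def simulate_poverty(X, cluster, beta_hat, cov_beta, eta_hat, eps_hat, z,
                     n_sim=100, seed=0):
    """Return (mean, std) of FGT(0), FGT(1), FGT(2) over n_sim simulations."""
    rng = np.random.default_rng(seed)
    n_households = X.shape[0]
    n_clusters = int(cluster.max()) + 1
    draws = np.empty((n_sim, 3))
    for s in range(n_sim):
        # Coefficients: one draw from N(beta_hat, cov_beta).
        beta = rng.multivariate_normal(beta_hat, cov_beta)
        # Residuals: resampled with replacement from the estimated ones,
        # one cluster effect per cluster and one term per household.
        eta = rng.choice(eta_hat, size=n_clusters, replace=True)
        eps = rng.choice(eps_hat, size=n_households, replace=True)
        # Simulated expenditure: exp(predicted log expenditure + disturbances).
        y = np.exp(X @ beta + eta[cluster] + eps)
        draws[s] = [fgt(y, z, a) for a in (0, 1, 2)]
    # Mean = point estimate, standard deviation = bootstrap standard error.
    return draws.mean(axis=0), draws.std(axis=0, ddof=1)


# Synthetic "census" and first-stage output (all values are made up).
data_rng = np.random.default_rng(42)
n_households, n_clusters = 2000, 50
X = np.column_stack([np.ones(n_households), data_rng.normal(size=n_households)])
cluster = data_rng.integers(0, n_clusters, size=n_households)
beta_hat = np.array([9.0, 0.3])
cov_beta = np.diag([0.01 ** 2, 0.01 ** 2])
eta_hat = data_rng.normal(scale=0.1, size=30)    # estimated cluster residuals
eps_hat = data_rng.normal(scale=0.4, size=500)   # estimated household residuals
poverty_line = 5000.0

est, se = simulate_poverty(X, cluster, beta_hat, cov_beta,
                           eta_hat, eps_hat, poverty_line)
print("FGT(0), FGT(1), FGT(2) estimates:", est)
print("bootstrap standard errors:       ", se)
```

To obtain figures for a given area in such a sketch, the same function would be applied to the rows of `X` and `cluster` belonging to that area.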

For any given location, the means constitute the point estimates, while the standard deviations are the bootstrap standard errors of these estimates. Table 3 reports poverty measures and their bootstrap standard errors for the whole of Albania, disaggregated at rural – urban level, by the four strata, and by rural/urban type for the Coastal and Central regions (Strata 1 and 2). The measures refer to the year 2008 and constitute a further update after those for the year 2002 (Betti et al., 2003) and the year 2005 (Dabalen and Ferrè, 2008). The disaggregation into four strata is very useful for comparing these results with those obtained directly from the LSMS for the years 2002, 2005 and 2008 and reported in Table 4 (Source: INSTAT, 2009). The census-based predictions are quite consistent with those from LSMS 2008, with the only exception of the Head Count ratio in Stratum 2. According to both sources, Stratum 4 (Region of Tirana) is better off in terms of the percentage of individuals below the poverty line (head count, FGT(0)), while Stratum 3 (Mountain area) appears to have the highest proportion of poor individuals.

Table 3: Poverty indices (%), Poverty mapping 2008.

| Area | FGT0_est | FGT0_se | FGT1_est | FGT1_se | FGT2_est | FGT2_se |
|---|---|---|---|---|---|---|
| ALBANIA | 16.58% | 0.98% | 2.98% | 0.23% | 0.89% | 0.08% |
| URBAN | 13.06% | 1.05% | 2.92% | 0.29% | 1.08% | 0.12% |
| RURAL | 18.91% | 1.23% | 3.01% | 0.26% | 0.76% | 0.08% |
| STRATUM 1 | 12.65% | 1.01% | 2.59% | 0.24% | 0.89% | 0.10% |
| STRATUM 2 | 16.50% | 1.63% | 2.98% | 0.38% | 0.91% | 0.13% |
| STRATUM 3 | 35.73% | 2.84% | 5.89% | 0.73% | 1.46% | 0.23% |
| STRATUM 4 | 8.47% | 0.46% | 1.08% | 0.07% | 0.24% | 0.02% |
| Stratum 1 urban | 12.85% | 1.48% | 2.09% | 0.28% | 0.59% | 0.09% |
| Stratum 1 rural | 12.39% | 1.11% | 3.26% | 0.39% | 1.29% | 0.18% |
| Stratum 2 urban | 16.25% | 1.77% | 2.49% | 0.37% | 0.61% | 0.11% |
| Stratum 2 rural | 16.85% | 2.11% | 3.94% | 0.56% | 1.49% | 0.23% |
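One way to read the precision of the figures in Table 3 is the relative standard error, that is, the bootstrap standard error divided by the point estimate. The short sketch below computes it for the Head Count Ratio; the (estimate, standard error) pairs are copied from Table 3, while the helper `relative_se` is a name introduced here for illustration.

```python
# (estimate, standard error) of FGT(0), in percent, copied from Table 3.
table3_fgt0 = {
    "ALBANIA": (16.58, 0.98),
    "URBAN": (13.06, 1.05),
    "RURAL": (18.91, 1.23),
    "STRATUM 1": (12.65, 1.01),
    "STRATUM 2": (16.50, 1.63),
    "STRATUM 3": (35.73, 2.84),
    "STRATUM 4": (8.47, 0.46),
}


def relative_se(estimate, se):
    """Standard error expressed as a fraction of the point estimate."""
    return se / estimate


rse = {area: relative_se(est, se) for area, (est, se) in table3_fgt0.items()}
for area, value in rse.items():
    print(f"{area:10s} {value:6.1%}")
```

On these seven rows the relative standard error of FGT(0) stays below 10%, with Stratum 2 the least precise and Stratum 4 the most precise.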

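Table 4: Poverty indices, LSMS 2002, 2005 and 2008.

| Stratum | Poverty measure | 2002 Urban | 2002 Rural | 2002 Total | 2005 Urban | 2005 Rural | 2005 Total | 2008 Urban | 2008 Rural | 2008 Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Coast | FGT(0) | 20.2 | 20.9 | 20.6 | 11.6 | 19.7 | 16.2 | 10.7 | 15.0 | 13.0 |
| Coast | FGT(1) | 5.4 | 3.6 | 4.4 | 2.0 | 4.1 | 3.2 | 2.7 | 2.5 | 0.2 |
| Coast | FGT(2) | 2.1 | 1.0 | 1.5 | 0.6 | 1.3 | 1.0 | 1.0 | 0.6 | 0.7 |
| Central | FGT(0) | 19.3 | 28.5 | 25.6 | 12.5 | 25.9 | 21.2 | 10.3 | 10.9 | 10.7 |
| Central | FGT(1) | 3.8 | 6.5 | 5.7 | 3.0 | 6.0 | 5.0 | 1.9 | 1.9 | 1.9 |
| Central | FGT(2) | 1.2 | 2.1 | 1.8 | 1.2 | 2.1 | 1.8 | 0.6 | 0.4 | 0.5 |
| Mountain | FGT(0) | 24.7 | 49.5 | 44.5 | 17.1 | 27.7 | 25.6 | 14.7 | 29.8 | 26.6 |
| Mountain | FGT(1) | 6.5 | 12.3 | 11.1 | 3.6 | 5.5 | 5.1 | 3.2 | 6.2 | 5.6 |
| Mountain | FGT(2) | 2.6 | 4.4 | 4.1 | 1.1 | 1.7 | 1.5 | 1.2 | 1.8 | 1.7 |
| Tirana | FGT(0) | 17.8 | | 17.8 | 8.1 | | 8.1 | 8.7 | | 8.7 |
| Tirana | FGT(1) | 3.8 | | 3.8 | 1.6 | | 1.6 | 1.2 | | 1.2 |
| Tirana | FGT(2) | 1.3 | | 1.3 | 0.5 | | 0.5 | 0.2 | | 0.2 |
| Albania | FGT(0) | 19.5 | 29.6 | 25.4 | 11.2 | 24.2 | 18.5 | 10.1 | 14.6 | 12.4 |
| Albania | FGT(1) | 4.5 | 6.6 | 5.7 | 2.3 | 5.3 | 4.0 | 1.9 | 2.6 | 2.3 |
| Albania | FGT(2) | 1.6 | 2.1 | 1.9 | 0.8 | 1.8 | 1.3 | 0.6 | 0.7 | 0.7 |

Source: INSTAT (2009), Albania: trends in poverty 2002-2005-2008.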

Table 5 reports the measures calculated at Prefecture level. Poverty measures are spatially very heterogeneous among Prefectures: the Prefectures of Gjirokaster, Vlore, Tirane and Fier have the lowest percentages of poor people, between about 11% and 12%. On the other hand, the Prefectures of Diber and Kukes seem to be the worst off, with the highest percentages of poor individuals (37.37% and 33.41% respectively). Table 6 reports poverty measures disaggregated at District level, while the measures disaggregated at Commune/Municipality level are not reported in the paper for the sake of space.

Figures 1, 2 and 3 report poverty maps for Prefectures, Districts and Municipalities respectively, where the index of interest is the Head Count Ratio FGT(0). In each map, the areas have been grouped into four categories, identified by three thresholds. The middle threshold is the Head Count Ratio calculated for the whole of Albania (16.58%), which separates the areas into two groups. The bottom threshold corresponds to the median value of the less deprived Districts (13%), while the top threshold corresponds to the median value of the most deprived Districts (25%), both calculated on the basis of the distribution at Prefecture level. The same principle has been used for constructing the poverty maps corresponding to the Poverty Gap Ratio FGT(1) for Prefectures, Districts and Municipalities, reported in Figures 4, 5 and 6. Finally, the same applies to the Severity Index FGT(2) in Figures 7, 8 and 9.
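The four-category grouping used in the maps can be sketched in a few lines. The thresholds (13%, 16.58%, 25%) are those quoted in the text and the Head Count Ratios are the Prefecture values of Table 5; the function `poverty_group`, the numbering of the groups from 1 (least deprived) to 4 (most deprived) and the treatment of a value exactly equal to a threshold (assigned to the upper group) are assumptions made here for illustration.

```python
from bisect import bisect_right

# Bottom, middle and top thresholds for FGT(0), in percent (from the text).
THRESHOLDS = (13.0, 16.58, 25.0)

# FGT(0) point estimates by Prefecture, in percent (from Table 5).
PREFECTURE_FGT0 = {
    "BERAT": 15.85, "DIBËR": 37.37, "DURRËS": 15.14, "ELBASAN": 19.40,
    "FIER": 11.66, "GJIROKASTËR": 10.89, "KORÇË": 14.22, "KUKËS": 33.41,
    "LEZHË": 17.02, "SHKODËR": 19.17, "TIRANË": 11.71, "VLORË": 11.30,
}


def poverty_group(head_count, thresholds=THRESHOLDS):
    """Return the map category, from 1 (least deprived) to 4 (most deprived)."""
    # bisect_right counts how many thresholds are <= head_count.
    return bisect_right(thresholds, head_count) + 1


groups = {name: poverty_group(rate) for name, rate in PREFECTURE_FGT0.items()}
for name, group in sorted(groups.items(), key=lambda item: item[1]):
    print(f"group {group}: {name} ({PREFECTURE_FGT0[name]:.2f}%)")
```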

Table 5: Poverty indices by PREFECTURE (%)

| Prefecture | FGT0_est | FGT0_se | FGT1_est | FGT1_se | FGT2_est | FGT2_se |
|---|---|---|---|---|---|---|
| 1: BERAT | 15.85% | 1.83% | 2.70% | 0.38% | 0.79% | 0.12% |
| 2: DIBËR | 37.37% | 2.09% | 6.88% | 0.63% | 1.88% | 0.22% |
| 3: DURRËS | 15.14% | 1.05% | 3.41% | 0.31% | 1.25% | 0.14% |
| 4: ELBASAN | 19.40% | 1.40% | 3.12% | 0.31% | 0.85% | 0.10% |
| 5: FIER | 11.66% | 1.12% | 2.14% | 0.24% | 0.67% | 0.08% |
| 6: GJIROKASTËR | 10.89% | 1.16% | 1.81% | 0.24% | 0.54% | 0.08% |
| 7: KORÇË | 14.22% | 1.64% | 2.44% | 0.34% | 0.74% | 0.11% |
| 8: KUKËS | 33.41% | 2.73% | 5.39% | 0.71% | 1.30% | 0.22% |
| 9: LEZHË | 17.02% | 1.51% | 3.30% | 0.37% | 1.06% | 0.14% |
| 10: SHKODËR | 19.17% | 1.96% | 3.93% | 0.52% | 1.30% | 0.21% |
| 11: TIRANË | 11.71% | 0.66% | 1.80% | 0.14% | 0.47% | 0.04% |
| 12: VLORË | 11.30% | 0.90% | 2.41% | 0.25% | 0.84% | 0.11% |

Figure 1. FGT(0) at Prefecture level.

Figure 2. FGT(0) at District level.

Figure 3. FGT(0) at Municipality level.


Figure 4. FGT(1) at Prefecture level.

Figure 5. FGT(1) at District level.

Figure 6. FGT(1) at Municipality level.


Table 6: Poverty indices by DISTRICT (%)

| District | FGT0_est | FGT0_se | FGT1_est | FGT1_se | FGT2_est | FGT2_se |
|---|---|---|---|---|---|---|
| 1: BERAT | 16.48% | 1.88% | 2.78% | 0.40% | 0.81% | 0.13% |
| 2: BULQIZË | 33.69% | 3.53% | 5.43% | 0.86% | 1.31% | 0.27% |
| 3: DELVINË | 7.41% | 1.49% | 1.42% | 0.37% | 0.46% | 0.14% |
| 4: DEVOLL | 12.18% | 2.07% | 1.78% | 0.36% | 0.47% | 0.11% |
| 5: DIBËR | 50.39% | 3.11% | 9.63% | 1.02% | 2.63% | 0.37% |
| 6: DURRËS | 14.78% | 1.20% | 3.58% | 0.39% | 1.37% | 0.18% |
| 7: ELBASAN | 15.59% | 1.52% | 2.83% | 0.36% | 0.88% | 0.13% |
| 8: FIER | 11.06% | 1.03% | 2.10% | 0.24% | 0.67% | 0.09% |
| 9: GRAMSH | 28.25% | 3.04% | 3.86% | 0.64% | 0.81% | 0.17% |
| 10: GJIROKASTËR | 9.16% | 0.99% | 1.57% | 0.20% | 0.48% | 0.07% |
| 11: HAS | 44.93% | 3.65% | 7.96% | 1.06% | 2.05% | 0.35% |
| 12: KAVAJË | 12.77% | 1.39% | 2.50% | 0.32% | 0.84% | 0.12% |
| 13: KOLONJË | 14.68% | 2.37% | 2.60% | 0.53% | 0.83% | 0.19% |
| 14: KORÇË | 13.27% | 1.53% | 2.35% | 0.33% | 0.74% | 0.12% |
| 15: KRUJË | 16.18% | 1.75% | 2.93% | 0.42% | 0.91% | 0.15% |
| 16: KUÇOVË | 11.34% | 1.58% | 1.97% | 0.33% | 0.61% | 0.12% |
| 17: KUKËS | 36.83% | 3.13% | 5.92% | 0.83% | 1.41% | 0.26% |
| 18: KURBIN | 14.68% | 1.59% | 3.03% | 0.43% | 1.05% | 0.18% |
| 19: LEZHË | 14.61% | 1.83% | 2.76% | 0.45% | 0.88% | 0.17% |
| 20: LIBRAZHD | 26.82% | 3.34% | 3.51% | 0.65% | 0.71% | 0.16% |
| 21: LUSHNJË | 12.06% | 1.41% | 2.17% | 0.31% | 0.67% | 0.11% |
| 22: MALËSI E MADHE | 15.53% | 1.87% | 2.70% | 0.41% | 0.79% | 0.15% |
| 23: MALLAKASTËR | 13.29% | 1.94% | 2.29% | 0.45% | 0.67% | 0.16% |
| 24: MAT | 21.70% | 2.41% | 4.04% | 0.60% | 1.21% | 0.21% |
| 25: MIRDITË | 24.89% | 2.58% | 4.71% | 0.70% | 1.39% | 0.26% |
| 26: PEQIN | 19.04% | 2.35% | 3.34% | 0.51% | 1.00% | 0.18% |
| 27: PËRMET | 11.80% | 1.59% | 1.91% | 0.32% | 0.56% | 0.11% |
| 28: POGRADEC | 17.03% | 1.92% | 2.89% | 0.42% | 0.84% | 0.14% |
| 29: PUKË | 32.21% | 3.34% | 6.64% | 0.96% | 2.08% | 0.37% |
| 30: SARANDË | 8.92% | 1.15% | 1.74% | 0.32% | 0.57% | 0.13% |
| 31: SKRAPAR | 18.48% | 2.62% | 3.19% | 0.57% | 0.94% | 0.19% |
| 32: SHKODËR | 17.47% | 1.93% | 3.67% | 0.51% | 1.26% | 0.21% |
| 33: TEPELENË | 13.09% | 1.64% | 2.14% | 0.33% | 0.62% | 0.11% |
| 34: TIRANË | 11.55% | 0.68% | 1.69% | 0.14% | 0.41% | 0.04% |
| 35: TROPOJË | 17.51% | 2.07% | 2.40% | 0.41% | 0.51% | 0.11% |
| 36: VLORË | 12.15% | 1.01% | 2.64% | 0.29% | 0.94% | 0.12% |


Figure 7. FGT(2) at Prefecture level.

Figure 8. FGT(2) at District level.

Figure 9. FGT(2) at Municipality level.

It is interesting to compare the poverty mapping results for the year 2008 with those obtained for the year 2002 by Betti et al. (2003). Here the grouping has been performed using the three thresholds calculated in the 2002 Poverty Mapping (Betti et al., 2003). For the year 2002, the thresholds were calculated using the same procedure described above, but independently for each of the three distributions. Figures 10 and 11 correspond to the Head Count Ratio FGT(0) at Prefecture level, while Figures 12, 13 and 14 correspond to the Head Count Ratio FGT(0) at District level (Figure 13 with thresholds calculated on the basis of the distribution of Districts, as in 2002, and Figure 14 with thresholds calculated on the basis of the distribution of Prefectures, as in 2008). Finally, Figures 15, 16 and 17 correspond to the Head Count Ratio FGT(0) at Municipality level (Figure 16 with thresholds calculated on the basis of the distribution of Municipalities, as in 2002, and Figure 17 with thresholds calculated on the basis of the distribution of Prefectures, as in 2008).
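The threshold rule referred to above (the national rate as the middle cut-off, and the medians of the less deprived and of the most deprived areas as the bottom and top cut-offs) can be sketched as follows. This is one possible reading of the rule, not the authors' code: the function `map_thresholds` is a name introduced here, "less deprived" is taken to mean a Head Count Ratio below the national rate, and the medians are unweighted. The input values are the 2008 District Head Count Ratios of Table 6, used only because the 2002 distributions are not reproduced in this paper.

```python
from statistics import median

NATIONAL_FGT0 = 16.58  # Head Count Ratio for the whole of Albania, in percent

# FGT(0) point estimates for the 36 Districts, in Table 6 order, in percent.
DISTRICT_FGT0 = [
    16.48, 33.69, 7.41, 12.18, 50.39, 14.78,
    15.59, 11.06, 28.25, 9.16, 44.93, 12.77,
    14.68, 13.27, 16.18, 11.34, 36.83, 14.68,
    14.61, 26.82, 12.06, 15.53, 13.29, 21.70,
    24.89, 19.04, 11.80, 17.03, 32.21, 8.92,
    18.48, 17.47, 13.09, 11.55, 17.51, 12.15,
]


def map_thresholds(rates, national_rate):
    """Return (bottom, middle, top) thresholds for a distribution of rates."""
    less_deprived = [r for r in rates if r < national_rate]
    most_deprived = [r for r in rates if r >= national_rate]
    # statistics.median raises StatisticsError if either group is empty.
    return median(less_deprived), national_rate, median(most_deprived)


bottom, middle, top = map_thresholds(DISTRICT_FGT0, NATIONAL_FGT0)
print(f"bottom={bottom:.2f}  middle={middle:.2f}  top={top:.2f}")
```

Under these assumptions the District distribution gives a bottom threshold of about 12.9% and a top threshold of about 25.9%, close to but not identical with the 13% and 25% quoted in the text, so the authors' exact rule may differ, for instance through rounding or population weighting.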

Figure 10. FGT(0) at Prefecture level, 2002.

Figure 11. FGT(0) at Prefecture level, 2008.

Source: Betti et al. (2003).


Figure 12. FGT(0) at District level, 2002.

Figure 13. FGT(0) at District level, 2008.

Source: Betti et al. (2003).

Figure 14. FGT(0) at District level, 2008, threshold based on Prefectures 2002.


Figure 15. FGT(0) at Municipality level, 2002.

Figure 16. FGT(0) at Municipality level, 2008.

Source: Betti et al. (2003).

Figure 17. FGT(0) at Municipality level, 2008, threshold based on Prefectures 2002.


6. Conclusions and recommendations

In this paper we have estimated various measures of welfare for small administrative units in Albania, combining the 2001 Population and Housing Census with the 2002 and 2008 Living Standards Measurement Study survey data, in order to obtain measures updated to the year 2008. Our estimates of poverty at the Stratum level, the level at which the household survey is representative, are quite comparable to those calculated using the sample survey; in fact the welfare rankings of the four strata are completely consistent, with the only exception of Stratum 2. Welfare rankings of any administrative unit using various measures of poverty are consistent as well; poverty is more pronounced and less heterogeneous in rural areas than in urban areas. We have produced poverty rates that are very precise at Stratum, Prefecture and District level, and rates that are precise enough to be of value to researchers and policy-makers even at Commune level.

Some findings of this report could be very useful for policy-makers. As already highlighted in Betti et al. (2003), in Albania there is considerable heterogeneity of poverty rates across administrative units; this heterogeneity is very pronounced between rural and urban areas. When we compare different levels of disaggregation, we observe large spatial heterogeneity among Prefectures and among Municipalities within the District to which they belong. This spatial heterogeneity is much less pronounced among Districts within the same Prefecture. We therefore conclude this report by recommending that policy-makers, once again, focus their attention on "large scale" projects at Prefecture level and on more specific and "well oriented" projects at Commune/Municipality level, rather than at District level.


References

Betti, G., Ballini, F. and Neri, L. (2003). Poverty and Inequality Mapping in Albania, Report to the World Bank.

Dabalen, A. and Ferrè, C. (2008). Updating Poverty Maps: A Case Study of Albania, mimeo, World Bank.

Deaton, A. (1997). The Analysis of Household Surveys: A Microeconometric Approach to Development Policy. Johns Hopkins University Press and The World Bank: Washington, D.C.

Elbers, C., Lanjouw, J.O. and Lanjouw, P. (2002). Micro-level Estimation of Welfare, Working Paper n. 2911, The World Bank, Washington, D.C.

Elbers, C., Lanjouw, J.O. and Lanjouw, P. (2003). Micro-level Estimation of Poverty and Inequality. Econometrica, 71(1), 355-364.

INSTAT (2000). General Census of Agricultural Holdings 1998.

INSTAT (2002). The Population of Albania in 2001.

Lemieux, T. (2002). Decomposing changes in wage distributions: a unified approach, Canadian Journal of Economics, 35(4), 646-688.

Levinson, R. (2001). Sample design for the 2002 Living Standards Measurement Survey (LSMS), Final Report to the World Bank, November 2001.

Neri, L., Ballini, F. and Betti, G. (2005). Poverty and inequality mapping in transition countries, Statistics in Transition, 7(1), 135-157.

Raghunathan, T.E., Lepkowski, J., Van Hoewyk, J. and Solenberger, P. (2001). A Multivariate Technique for Imputing Missing Values Using a Sequence of Regression Models, Survey Methodology, 27, 85-95.

The World Bank (2002). Basic Information Document, Living Standard Measurement Study, Albania, Development Research Group.

The World Bank (2003). Construction of the consumption aggregate and estimation of the poverty line, LSMS 2002 – Albania.
