Bayes Estimation and Prediction under Informative Sampling Design

Abdulhakeem A.H. Eideh
Department of Mathematics, Al-Quds University, Al-Quds, Palestine
[email protected]

Pak.j.stat.oper.res. Vol.XIII No.2 2017 pp327-353

Abstract

In this paper we deal with the problem of Bayes estimation of the parameters that characterise the superpopulation model, and Bayes prediction of the finite population total, from sample survey data selected from a finite population using an informative probability sampling design; that is, the first-order sample inclusion probabilities depend on the values of the model outcome variable (or the model outcome variable is correlated with design variables not included in the model). To achieve this we first define the sample predictive distribution and the sample posterior distribution, and then use the sample posterior likelihood function to obtain sample Bayes estimates of the superpopulation model parameters and Bayes predictors of the finite population total. These new predictors take the informative sampling design into account, and thus provide new justification for the broad use of best linear unbiased predictors (the model-based school) in predicting finite population parameters when the complex sampling design is not accounted for. Furthermore, we show that the behaviour of the present estimators and predictors depends on the informativeness parameters, and that Bayes estimators and predictors that ignore the informative sampling design are biased. One of the most important features of this paper is that specifying a prior distribution for the parameters of the sample distribution is easier than specifying one for the population parameters.

Keywords: Bayes estimator, Finite population sampling, Informative sampling, Sample distribution.

1. Introduction

Survey data may be viewed as the outcome of two processes: the process that generates the values of units in the finite population, often referred to as the superpopulation model (the values of units in the finite population are a random sample from the superpopulation model), and the process of selecting the sample units from the finite population, known as the sample selection mechanism. Analytic inference from sample survey data refers to the superpopulation model, while descriptive inference deals with the estimation of finite population characteristics. When the sample selection probabilities depend on the values of the model response (or the model outcome variable is correlated with design variables not included in the model), even after conditioning on the auxiliary variables, the sampling mechanism becomes informative, and the selection effects need to be accounted for in the inference process. To overcome the difficulties associated with the use of classical inference procedures for cross-sectional survey data, Pfeffermann, Krieger and Rinott (1998) proposed the use of the sample distribution induced by assumed population models under informative sampling, and developed expressions for its calculation. Similarly, Eideh and Nathan (2006) fitted time series models to longitudinal survey data under informative sampling. For examples in the fields of cross-sectional and panel surveys that illustrate the effects of ignoring an informative sampling design, see the books edited by Skinner, Holt and Smith (1989) and by Chambers and Skinner (2003). In particular, see the articles by Pfeffermann (2011), Pfeffermann and Sverchkov (1999, 2003), Sverchkov and Pfeffermann (2004), Eideh and Nathan (2006a, 2006b, 2009), and Eideh (2003, 2012a, 2012b). In these articles the authors review many examples reported in the literature that illustrate the effect of ignoring the sampling process when fitting models to complex survey data, and they discuss methods to deal with this problem.

Kim (2002) considers Bayesian and empirical Bayesian approaches to small area estimation under informative sampling. Sverchkov and Pfeffermann (2004) study the use of the sample distribution for the prediction of finite population totals under single-stage sampling. They propose predictors that employ the sample values of the target study variable, the sampling weights of the sample units, and possibly known population values of auxiliary variables. They solve the prediction problem by estimating the expectation of the study values for units outside the sample as a function of the corresponding expectation under the sample distribution and the sampling weights. In their article they use the sample and sample-complement distributions to develop design-consistent predictors of finite population totals; known predictors in common use are shown to be special cases of the proposed theory, which provides them with a new interpretation. The mean square errors (MSEs) of the new predictors are estimated by a combination of an inverse sampling algorithm and a resampling method, and the performance of the new predictors and of some predictors in common use is evaluated and compared in a Monte Carlo simulation study using a real data set.
As supported by theory and illustrated in the empirical study, predictors of finite population totals that only require the prediction of the outcome values for units outside the sample perform better than predictors in common use even under a design-based framework, unless the sampling fractions are very small. The MSE estimators are shown to perform well, both in terms of bias and when used for the computation of confidence intervals for the population totals. The authors point out that "further experimentation with this kind of predictors and MSE estimation is therefore highly recommended". Nandram et al. (2006) study the problem in which a biased sample is selected from a finite population (itself a random sample from a superpopulation), and inference is required for the finite population mean and the superpopulation mean. The measurements may not be normally distributed, which is a necessary condition for their method, so a transformation that can provide an approximate normal distribution may be needed. The authors use the Gibbs sampler and the sampling importance resampling algorithm to fit the non-ignorable selection model to a simple example on natural gas production. Their non-ignorable selection model estimates the finite population mean production much closer to the true finite population mean than a model which ignores the selection probabilities, and the non-ignorable selection model has improved precision over the latter model; a naive 95% credible interval based on the Horvitz-Thompson estimator is too wide. Rao (2011) provides an account of both parametric and nonparametric Bayesian (and pseudo-Bayesian) approaches to inference from survey data, focusing on descriptive finite population parameters. Little (2012) characterizes the prevailing philosophy of official statistics as a design/model compromise (DMC).
It is design-based for descriptive inferences from large samples, and model-based for small area estimation, nonsampling errors such as nonresponse or measurement error, and some other subfields like autoregressive integrated moving average (ARIMA) modeling of time series. Little suggests that DMC involves a form of "inferential schizophrenia", and offers examples of the problems this creates. He discusses an alternative philosophy for survey inference called calibrated Bayes (CB), in which inferences for a particular data set are Bayesian, but models are chosen to yield inferences that have good design-based properties. Little argues that CB resolves the DMC conflicts and capitalizes on the strengths of both the frequentist and Bayesian approaches. Features of the CB approach to surveys include the incorporation of survey design information into the model, and models with weak prior distributions that avoid strong parametric assumptions. Savitsky and Toth (2016) propose to construct a pseudo-posterior distribution that uses sampling weights based on the marginal inclusion probabilities to exponentiate the likelihood contribution of each sampled unit, which weights the information in the sample back to the population. They construct conditions on known marginal and pairwise inclusion probabilities that define a class of sampling designs for which consistency of the pseudo-posterior is guaranteed.

In this paper we account for an informative probability sampling design in the Bayesian approach to sample survey inference, namely Bayes estimation of the parameters of the superpopulation model and prediction of the finite population total. The plan of the paper is as follows. Section 2 is devoted to the definitions of the sample and sample-complement distributions. In Section 3 we define the sample posterior distribution. In Section 4 we develop the sample posterior likelihood and Bayesian inference for the superpopulation parameter. The Bayesian approach is studied for normal and Bernoulli superpopulation models in Sections 5 and 6. In Section 7 we propose a new Bayes predictor of the finite population total. Finally, some conclusions are presented in Section 8.

2. Sample and Sample-Complement Distributions

Let $U = \{1,\ldots,N\}$ denote a finite population consisting of $N$ units.
Let $y$ be the study variable of interest and let $y_i$ be the value of $y$ for the $i$th population unit. We consider the population values $y_1,\ldots,y_N$ as random variables, which are independent realizations from a distribution with probability density function (pdf) $f_p(y_i \mid \theta)$, indexed by a vector $\theta$ of parameters, $\theta \in \Theta$, where $\Theta$ is the parameter space of $\theta$. Let $\mathbf{x}_i = (x_{i1},\ldots,x_{ip})'$, $i \in U$, be the values of a vector of auxiliary variables $x_1,\ldots,x_p$, and let $\mathbf{z} = (z_1,\ldots,z_N)$ be the values of known design variables used for the sample selection process but not included in the model under consideration. In what follows, we consider a sampling design with selection probabilities $\pi_i = \Pr(i \in s)$ and sampling weights $w_i = 1/\pi_i$, $i = 1,\ldots,N$. In practice the $\pi_i$'s may depend on the population values $(\mathbf{x}, \mathbf{y}, \mathbf{z})$. We express this dependence by writing $\pi_i = \Pr(i \in s \mid \mathbf{x}, \mathbf{y}, \mathbf{z})$ for all units $i \in U$. Since $\pi_1,\ldots,\pi_N$ are defined by the realizations $(\mathbf{x}_i, y_i, z_i)$, $i = 1,\ldots,N$, they are random realizations defined on the space of possible populations. The sample $s$ consists of the subset of $U$ selected at random by the sampling scheme with inclusion probabilities $\pi_1,\ldots,\pi_N$. Denote by $\mathbf{I} = (I_1,\ldots,I_N)$ the $N$ by 1 sample indicator vector, such that $I_i = 1$ if unit $i \in U$ is selected to the sample and $I_i = 0$ otherwise. The sample $s$ is defined accordingly as $s = \{i \mid i \in U,\ I_i = 1\}$ and its complement by $c = \bar{s} = \{i \mid i \in U,\ I_i = 0\}$. We assume probability sampling, so that $\pi_i = \Pr(i \in s) > 0$ for all units $i \in U$. Let $f_p$ and $E_p(\cdot)$ denote the pdf and the mathematical expectation of the population distribution, respectively; $f_s$ and $E_s(\cdot)$ the pdf and expectation of the sample distribution; and $f_c$ and $E_c(\cdot)$ the pdf and expectation of the sample-complement distribution. Assume that the population pdf depends on known values of the auxiliary variables $\mathbf{x}_i$, so that $y_i \sim f_p(y_i \mid \mathbf{x}_i, \theta)$. According to Pfeffermann, Krieger and Rinott (1998), the (marginal) sample pdf of $y_i$ is given by:

f s  yi | x i , ,    f p  yi | x i , ,  , i  s  

where

E p  i | x i , y i ,   f p  y i | x i ,  

(1)

E p  i | x i ,  ,  

E p  i | x i , ,     E p  i | x i , y i ,   f p  y i | x i , dyi

and  is the informativeness parameter.

Similarly, Sverchkov and Pfeffermann (2004) define the sample-complement pdf of $y_i$ as:

$$f_c(y_i \mid \mathbf{x}_i) = \frac{E_p(1-\pi_i \mid \mathbf{x}_i, y_i)\, f_p(y_i \mid \mathbf{x}_i)}{E_p(1-\pi_i \mid \mathbf{x}_i)} \qquad (2)$$

For the vector of random variables $(y_i, \mathbf{x}_i)$, the following relationships hold:

$$E_s(w_i \mid y_i) = \frac{1}{E_p(\pi_i \mid y_i)} \qquad (3a)$$

$$E_p(y_i \mid \mathbf{x}_i) = \frac{E_s(w_i y_i \mid \mathbf{x}_i)}{E_s(w_i \mid \mathbf{x}_i)} \qquad (3b)$$

$$E_s(w_i) = \frac{1}{E_p(\pi_i)} \qquad (3c)$$

$$E_c(y_i \mid \mathbf{x}_i) = \frac{E_p\{(1-\pi_i)\, y_i \mid \mathbf{x}_i\}}{E_p\{(1-\pi_i) \mid \mathbf{x}_i\}} = \frac{E_s\{(w_i - 1)\, y_i \mid \mathbf{x}_i\}}{E_s\{(w_i - 1) \mid \mathbf{x}_i\}} \qquad (3d)$$
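The weighting identities (3b) and (3c) can be checked by simulation (an illustrative sketch, not from the paper; the design $\pi_i = c\,e^{a y_i}$ and its constants are assumptions):

```python
import math
import random
import statistics

random.seed(2)
N = 400_000
a, c = 0.5, 0.01  # illustrative informative design: pi_i = c * exp(a * y_i)

pop = [random.gauss(0.0, 1.0) for _ in range(N)]
pis = [min(1.0, c * math.exp(a * y)) for y in pop]

# Poisson sampling; keep (y_i, w_i) pairs for the selected units.
sample = [(y, 1.0 / p) for y, p in zip(pop, pis) if random.random() < p]
w = [wi for _, wi in sample]
wy = [yi * wi for yi, wi in sample]

# (3c): E_s(w_i) = 1 / E_p(pi_i), i.e. E_s(w_i) * E_p(pi_i) = 1.
assert abs(statistics.fmean(w) * statistics.fmean(pis) - 1.0) < 0.05
# (3b): E_p(y_i) = E_s(w_i y_i) / E_s(w_i); here E_p(y) = 0.
assert abs(statistics.fmean(wy) / statistics.fmean(w)) < 0.1
```

The second check is the familiar Hajek-type weighted mean: dividing the weighted sample mean by the mean weight recovers the population expectation despite the informative selection.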

See Pfeffermann and Sverchkov (1999) and Sverchkov and Pfeffermann (2004). Having derived the sample distribution, Pfeffermann, Krieger and Rinott (1998) proved that if the population measurements $y_i$ are independent, then as $N \to \infty$ with $n$ fixed the sample measurements are asymptotically independent, so we can apply standard inference procedures to complex survey data by using the marginal sample distribution for each unit. Based on the sample data $\{(y_i, \mathbf{x}_i, w_i);\ i \in s\}$, Pfeffermann, Krieger and Rinott (1998) proposed a two-step estimation method:


Step one: Estimate the informativeness parameter $\gamma$ using equation (3a), and denote the resulting estimate by $\tilde{\gamma}$.

Step two: Substitute $\tilde{\gamma}$ into the sample log-likelihood function, and then maximize the resulting sample log-likelihood with respect to the population parameters $\theta$:

$$l_{rs}(\theta, \tilde{\gamma}) = l_{srs}(\theta) - \sum_{i=1}^{n} \log E_p(\pi_i \mid \mathbf{x}_i, \theta, \tilde{\gamma}) = l_{srs}(\theta) + \sum_{i=1}^{n} \log E_s(w_i \mid \mathbf{x}_i, \theta, \tilde{\gamma}) \qquad (4)$$

where $l_{rs}(\theta, \tilde{\gamma})$ is the sample log-likelihood after substituting $\tilde{\gamma}$, and $l_{srs}(\theta) = \sum_{i=1}^{n} \log f_p(y_i \mid \mathbf{x}_i, \theta)$ is the classical log-likelihood. For more discussion on the use of the sample and sample-complement distributions for analytic inference, see Pfeffermann and Sverchkov (1999, 2003), Sverchkov and Pfeffermann (2004), Eideh and Nathan (2009) and Eideh (2012a, 2012b).

3. Sample Posterior Distribution

In this section we describe the Bayesian approach to the problem of estimation. This approach takes into account any prior knowledge of the survey that the sampler has. Let $q(\theta)$ be the prior pdf of $\theta$; here we look upon $\theta$ as a possible value of the parameter. Assume that the sample conditional pdf of $y_i$ given $\theta \in \Theta$ is $f_s(y_i \mid \theta)$. The informativeness parameter $\gamma$ (the parameter of the sample selection process) that indexes the selection model is a characteristic of the data collection; it is not generally of scientific interest and it is not identifiable (see Nandram et al. (2006)). Therefore in this paper we treat $\gamma$ as fixed in repeated sampling, so that $f_s(y_i \mid \mathbf{x}_i, \theta, \gamma)$ can be written as $f_s(y_i \mid \theta)$. For more information on the estimation of $\gamma$, see Eideh (2012a).

We next define the sample joint pdf of $y_i$ and $\theta \in \Theta$, the Bayes sample marginal (or Bayes sample predictive) pdf of $y_i$, and the sample posterior distribution.

(a) The sample joint pdf of $y_i$ and $\theta$, denoted by $m_s(y_i, \theta)$, is defined by:

$$m_s(y_i, \theta) = q(\theta)\, f_s(y_i \mid \theta) \qquad (5)$$

(b) An important quantity in Bayesian inference and statistical decision theory is the marginal or Bayes predictive distribution of $Y$, which plays an important role in various scientific applications. The Bayes sample marginal or Bayes sample predictive pdf of $y_i$, denoted by $m_s(y_i)$, is given by:

$$m_s(y_i) = \begin{cases} \displaystyle\int_{\Theta} m_s(y_i, \theta)\, d\theta = \int_{\Theta} q(\theta)\, f_s(y_i \mid \theta)\, d\theta & \text{if } \theta \text{ is continuous} \\[6pt] \displaystyle\sum_{\theta \in \Theta} m_s(y_i, \theta) = \sum_{\theta \in \Theta} q(\theta)\, f_s(y_i \mid \theta) & \text{if } \theta \text{ is discrete} \end{cases} \qquad (6)$$


The Bayes sample predictive pdf of $y_i$ represents the evidence for a particular model, defined by the prior distribution $q(\theta)$; it is known as the prior predictive distribution because it represents the probability of the data $y_i$ before they are observed.

(c) We now introduce the sample posterior distribution, denoted by $q_s(\theta \mid y_i)$, which is a way, after (posterior to) observing $y_i$, of adjusting or updating the prior distribution $q(\theta)$ (the beliefs about $\theta \in \Theta$ prior to experimentation, or prior to the survey) to $q_s(\theta \mid y_i)$. Accordingly, all parametric statistical inferences about the parameters of interest $\theta \in \Theta$ are derived from the sample posterior distribution, so it represents a complete solution to the inference problem.

The sample posterior distribution can be viewed as a way of combining the prior information $q(\theta)$ and the sample information $f_s(y \mid \theta)$; in other words, $q_s(\theta \mid y_i)$ contains all of the information combining prior knowledge and observations. The sample posterior distribution of $\theta \in \Theta$ given $y_i$ is defined by:

$$q_s(\theta \mid y_i) = \frac{m_s(y_i, \theta)}{m_s(y_i)} = \frac{q(\theta)\, f_s(y_i \mid \theta)}{m_s(y_i)} = \frac{q(\theta)\, E_p(\pi_i \mid y_i, \theta)\, f_p(y_i \mid \theta)}{m_s(y_i)\, E_p(\pi_i \mid \theta)} = \frac{q(\theta) \Pr(i \in s \mid y_i, \theta)\, f_p(y_i \mid \theta)}{m_s(y_i) \Pr(i \in s \mid \theta)} \qquad (7)$$

Note that:

(i) If the sampling design is noninformative, that is, $\Pr(i \in s \mid y_i, \theta) = \Pr(i \in s \mid \theta)$ for all $y_i$, then the sample joint pdf of $y_i$ and $\theta$ is the same as the population joint pdf of $y_i$ and $\theta$, the Bayes sample predictive pdf of $y_i$ is the same as the Bayes population predictive pdf of $y_i$, and the sample posterior distribution of $\theta \in \Theta$ given $y_i$ is the same as the population posterior distribution of $\theta \in \Theta$ given $y_i$. In this case the data collection method (sampling design) does not influence Bayesian inference.

(ii) The Bayes sample predictive pdf of $y_i$, $m_s(y_i)$, represents the normalizing constant in Bayes' theorem, so that

$$q_s(\theta \mid y_i) = \frac{q(\theta)\, f_s(y_i \mid \theta)}{m_s(y_i)} \qquad (8)$$

is called the normalized sample posterior distribution, while

$$q_s(\theta \mid y_i) \propto q(\theta)\, f_s(y_i \mid \theta) \qquad (9)$$

is the unnormalized one.


(iii) The sample posterior distribution (the distribution of $\theta$ after the sample is drawn) can be viewed as a way of combining the prior information $q(\theta)$ (which reflects the subjective belief about $\theta$ before the sample is drawn), the sampling design $\Pr(i \in s \mid y_i, \theta)$, and the population information $f_p(y_i \mid \theta)$. In other words, $q_s(\theta \mid y_i)$ contains all of the information combining prior knowledge, sampling design and observations.

(iv) With $f_p(y_i \mid \theta)$ and $q(\theta)$ fixed, the sample posterior distribution is completely determined by $E_p(\pi_i \mid y_i, \theta)$. For more information on this issue, see Pfeffermann, Krieger and Rinott (1998) and Eideh (2010). Hence the sample posterior distribution combines the population distribution $f_p(y_i \mid \theta)$, the prior distribution $q(\theta)$, and $E_p(\pi_i \mid y_i, \theta)$.

(v) Furthermore,

$$\frac{q_s(\theta_1 \mid y_i)}{q_s(\theta_2 \mid y_i)} = \frac{q(\theta_1)}{q(\theta_2)} \cdot \frac{f_s(y_i \mid \theta_1)}{f_s(y_i \mid \theta_2)} \qquad (10)$$

That is, the sample posterior odds are equal to the prior odds, $q(\theta_1)/q(\theta_2)$, multiplied by the sample likelihood ratio,

$$\frac{L_s(\theta_1 \mid y_i)}{L_s(\theta_2 \mid y_i)} = \frac{f_s(y_i \mid \theta_1)}{f_s(y_i \mid \theta_2)} \qquad (11)$$

4. Sample Posterior Likelihood and Bayesian Inference for the Superpopulation Parameter

Data collected by sample surveys are used extensively to make inferences on assumed population models. Often, survey design features (clustering, stratification, unequal probability selection, etc.) are ignored and the sample data are then analyzed using classical methods based on simple random sampling. In this section we account for an informative sampling design in the analysis of survey data using Bayesian inference.

If $\mathbf{y}_s = (y_1, \ldots, y_n)$ are independent and identically distributed random variables from $f_s(y_i \mid \theta)$, then we can write the sample joint conditional pdf of $\mathbf{y}_s$ given $\theta \in \Theta$ as:

$$f_s(\mathbf{y}_s \mid \theta) = L_s(\mathbf{y}_s \mid \theta) = \prod_{i=1}^{n} f_s(y_i \mid \theta) \qquad (12)$$

Thus the sample joint pdf of $\mathbf{y}_s$ and $\theta$ is:

$$m_s(\mathbf{y}_s, \theta) = q(\theta)\, L_s(\mathbf{y}_s \mid \theta) \qquad (13)$$

If  is a continuous random variable, then Bayes sample predictive pdf of  y s   y1 , y n  is: ms y s    ms y s , d   q Ls y s  d 

(14)



Hence the conditional sample pdf of  s y s or the sample posterior likelihood:

qs  | y s   Ls  | y s  

q Ls y s  

(15)

ms y s 

 q Ls y s   Now we are in a position to deal with Bayesian inference, via the sample posterior distribution and the sample posterior likelihood. This provides the following representation of the sample posterior pdf of  | y s : Sample Posterior information = Prior information + Sampling design information + Sampling information And in terms of pdfs: Sample Posterior pdf = Prior pdf+ Sample Likelihood function = Prior pdf+ sampling design+ population pdf Therefore the sample posterior pdf of  | y s summarizes the total information, after viewing the sample data, collected under informative sampling design, and provides a basis for sample posterior or Bayesian inferences concerning the population parameters of interest. The generalized maximum likelihood estimator (MLE) ˆg   of    is the value of  that maximizes the sample posterior likelihood function, L  | y  . That is, ˆ   is s

s

g

 the most likely value of    , given the prior and the sample data y s   y1 , y n  . Obviously ˆ   has the interpretation of being the "most likely" value of  given the g

 prior and the sample data y s   y1 , y n  . From Bayesian viewpoint, estimating the population parameter  is obtained by choosing a decision function  y s  , which depends on a loss function L y s ,   , whose conditional expectation under the sample posterior distribution is minimum. That is,  y s   argmin E s L y s ,  y s  (16)  argmin  L y s , q s  | y s d 


If L y s ,    y s     (squared error loss function), then it is easy to show that, the sample Bayes estimate of    is given by: 2

 y s   ˆB  Es  y s 

(17)

ˆ B  Es   y s 

(18)

and the sample Bayes estimate of        is given by:

Moreover, the 100 1   % highest sample posterior density (HSPD) credible set for    , is the subset C of  of the form:

C     : q s  | y s   k  

(19)

 where y s   y1 , y n  , and k   is the largest constant such that: PC y s   1  

(20)
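As a quick numerical check that the squared-error Bayes estimate (17) is the sample posterior mean (an illustrative discrete example, not from the paper), one can minimize the posterior expected loss over a grid of candidate estimates:

```python
# Toy discrete sample posterior over three parameter values (illustrative).
thetas = [0.0, 1.0, 2.0]
post = [0.2, 0.5, 0.3]  # q_s(theta | y_s), sums to 1

def expected_loss(d):
    # E_s{ (d - theta)^2 | y_s }, the posterior expected squared-error loss
    return sum(p * (d - t) ** 2 for t, p in zip(thetas, post))

# Minimize over a fine grid of candidate decisions d.
grid = [k / 1000 for k in range(2001)]
d_star = min(grid, key=expected_loss)

post_mean = sum(t * p for t, p in zip(thetas, post))  # = 1.1
assert abs(d_star - post_mean) < 1e-3
```

The grid minimizer lands on the posterior mean, as (16)-(17) state.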

5. Application 1 - Normal Superpopulation Model

According to the definition of the sample posterior distribution, to carry out the Bayesian inference we need to specify $f_p(y_i \mid \theta)$, $q(\theta)$ and $E_p(\pi_i \mid y_i, \theta)$. From now on, we denote the parameter indexing the population distribution by $\theta_p$, and the parameter indexing the sample distribution by $\theta_s$. Consider the following population model:

$$y_i \mid \theta_p \sim N(\theta_p, \sigma^2), \quad \text{independent}, \quad i = 1, \ldots, N \qquad (21)$$

where $\sigma^2 > 0$ is known. Let $\mathbf{y}_s = (y_1, \ldots, y_n)$ be the sample data. Suppose that

$$E_p(\pi_i \mid y_i) = \exp(a_1 y_i) \qquad (22)$$

Using equation (1), it is easy to verify that the sample model is:

$$y_i \mid \theta_s \sim N(\theta_s, \sigma^2) \qquad (23)$$

where $\theta_s = \theta_p + a_1 \sigma^2 = \theta_s(\theta_p, a_1, \sigma^2)$ is a linear function of $\theta_p$. So in order to find the Bayes estimator of $\theta_p$, we (for simplicity, and to make life easier) choose a prior distribution for $\theta_s$ and then derive the Bayes estimator of $\theta_p$ via the relationship between $\theta_p$ and $\theta_s$. We consider different cases based on the prior distribution.
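The location shift in (23) can be checked by simulation; the constant $c$ below, which scales $E_p(\pi_i \mid y_i) = \exp(a_1 y_i)$ down to a valid inclusion probability, is an illustrative assumption:

```python
import math
import random
import statistics

random.seed(5)
theta_p, sigma2, a1 = 0.0, 1.0, 0.3
N, c = 600_000, 0.01

# Population draws from N(theta_p, sigma2), then Poisson sampling with
# pi_i proportional to exp(a1 * y_i), as in eq. (22).
pop = [random.gauss(theta_p, math.sqrt(sigma2)) for _ in range(N)]
sample = [y for y in pop if random.random() < min(1.0, c * math.exp(a1 * y))]

# Under eq. (23) the sampled values follow N(theta_p + a1*sigma2, sigma2).
shift = statistics.fmean(sample) - theta_p
assert abs(shift - a1 * sigma2) < 0.05
```

The empirical sample mean exceeds the population mean by approximately $a_1\sigma^2$, which is exactly the quantity the Bayes estimators below subtract.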









Case 1: Assume that the prior distribution for  s ~ N  , 2 with known  , 2 . Hence





in this case and the subsequent cases, a1 , , ,  is know, and the only unknown parameter is  p . Using the formulas in the previous sections, and doing some algebra we have: Pak.j.stat.oper.res. Vol.XIII No.2 2017 pp327-353

2

2

335

Abdulhakeem A.H. Eideh

(a) The sample posterior distribution of $\theta_s \mid \bar{y}_s$ is given by:

$$\theta_s \mid \bar{y}_s \sim N\!\left( \frac{\sigma^2 \varphi + n \delta^2 \bar{y}_s}{\sigma^2 + n \delta^2},\ \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} \right) \qquad (24)$$

where $\bar{y}_s = n^{-1} \sum_{i=1}^{n} y_i$, so that

$$E_s(\theta_s \mid \bar{y}_s) = \frac{\sigma^2 \varphi + n \delta^2 \bar{y}_s}{\sigma^2 + n \delta^2} \qquad (25)$$

and

$$V_s(\theta_s \mid \bar{y}_s) = \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} \qquad (26)$$

Now, for given $(a_1, \sigma^2, \delta^2, \varphi)$ and using $\theta_s = \theta_p + a_1 \sigma^2$, we have

$$E_s(\theta_p \mid \bar{y}_s) = E_s(\theta_s - a_1 \sigma^2 \mid \bar{y}_s) = \frac{\sigma^2 \varphi + n \delta^2 \bar{y}_s}{\sigma^2 + n \delta^2} - a_1 \sigma^2 \qquad (27)$$

and

$$V_s(\theta_p \mid \bar{y}_s) = V_s(\theta_s - a_1 \sigma^2 \mid \bar{y}_s) = V_s(\theta_s \mid \bar{y}_s) = \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} \qquad (28)$$

Furthermore, the sample posterior distribution of $\theta_p \mid \bar{y}_s$ is given by:

$$\theta_p \mid \bar{y}_s \sim N\!\left( \frac{\sigma^2 \varphi + n \delta^2 \bar{y}_s}{\sigma^2 + n \delta^2} - a_1 \sigma^2,\ \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} \right) \qquad (29)$$

Hence the sample and population posterior pdfs of $\theta_p \mid \bar{y}_s$ belong to the same family of normal distributions, but the mean under the sample posterior pdf differs from the mean under the population posterior pdf of $\theta_p \mid \bar{y}_s$. Furthermore, if $a_1 = 0$, that is, the sampling design is ignorable, then the sample and population posterior pdfs coincide.





(b) For given $(a_1, \sigma^2, \delta^2, \varphi)$, the Bayes estimate of $\theta_p \in \Theta$ under the sample posterior pdf is:

$$\hat{\theta}_{pB} = E_s(\theta_p \mid \bar{y}_s) = \frac{\sigma^2 \varphi + n \delta^2 \bar{y}_s}{\sigma^2 + n \delta^2} - a_1 \sigma^2 = \lambda \varphi + (1 - \lambda) \bar{y}_s - a_1 \sigma^2 = \hat{\theta}_{pB}(srs) - a_1 \sigma^2 \qquad (30)$$

where

$$\lambda = \frac{\sigma^2}{\sigma^2 + n \delta^2} \qquad (31)$$

and

$$\hat{\theta}_{pB}(srs) = \lambda \varphi + (1 - \lambda) \bar{y}_s \qquad (32)$$

is the Bayes estimate of $\theta_p$ under a noninformative sampling design.
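A minimal numerical sketch of (30)-(32) and the limit (34); the values of $(a_1, \sigma^2, \delta^2, \varphi, n, \bar{y}_s)$ are illustrative assumptions:

```python
# Illustrative inputs.
a1, sigma2, delta2, phi = 0.4, 1.0, 2.0, 0.0
n, ybar = 50, 5.4

lam = sigma2 / (sigma2 + n * delta2)        # eq. (31)
theta_srs = lam * phi + (1 - lam) * ybar    # eq. (32), design ignored
theta_pB = theta_srs - a1 * sigma2          # eq. (30), design accounted for

# Same estimate via the direct posterior-mean formula in eq. (30).
direct = (sigma2 * phi + n * delta2 * ybar) / (sigma2 + n * delta2) - a1 * sigma2
assert abs(theta_pB - direct) < 1e-12

# Eq. (34): as n grows, theta_pB approaches ybar - a1*sigma2.
lam_big = sigma2 / (sigma2 + 10**9 * delta2)
theta_big = lam_big * phi + (1 - lam_big) * ybar - a1 * sigma2
assert abs(theta_big - (ybar - a1 * sigma2)) < 1e-6
```

The informative-design estimate is simply the simple-random-sampling estimate shifted by $-a_1\sigma^2$, and the prior's influence vanishes as $n$ grows.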

Note that:

(1) If $a_1 = 0$ (that is, the sampling design is noninformative), then the Bayes estimates of $\theta_p \in \Theta$ based on the population model and the sample model coincide.

(2) If $a_1 > 0$ and $y_i > 0$, then $E_p(\pi_i \mid y_i) = e^{a_1 y_i}$ is an increasing function of $y_i$, so that larger values are more likely to be in the sample than smaller values. In this case $\hat{\theta}_{pB}(srs)$ overestimates $\theta_p$. The Bayes estimate $\hat{\theta}_{pB} = \hat{\theta}_{pB}(srs) - a_1 \sigma^2$ adjusts $\hat{\theta}_{pB}(srs)$ downward by subtracting the positive quantity $a_1 \sigma^2$.

(3) If $a_1 < 0$ and $y_i > 0$, then $E_p(\pi_i \mid y_i) = e^{a_1 y_i}$ is a decreasing function of $y_i$, so that smaller values are more likely to be in the sample than larger values. In this case $\hat{\theta}_{pB}(srs)$ underestimates $\theta_p$. The Bayes estimate $\hat{\theta}_{pB} = \hat{\theta}_{pB}(srs) - a_1 \sigma^2$ adjusts $\hat{\theta}_{pB}(srs)$ upward by subtracting the negative quantity $a_1 \sigma^2$.

(4) $\hat{\theta}_{pB} + a_1 \sigma^2 = \hat{\theta}_{sB}$ is a weighted average of $\varphi$ (the prior mean) and $\bar{y}_s$ (the maximum likelihood estimator of $\theta_s$), with weights $\lambda$ and $1 - \lambda$, respectively.

(5) $\hat{\theta}_{pB}$ can be written as:

$$\hat{\theta}_{pB} = \frac{n \delta^2}{\sigma^2 + n \delta^2} \bar{y}_s + \frac{\sigma^2}{\sigma^2 + n \delta^2} \varphi - a_1 \sigma^2 \qquad (33)$$

so that

$$\lim_{n \to \infty} \hat{\theta}_{pB} = \lim_{n \to \infty} \left( \frac{n \delta^2}{\sigma^2 + n \delta^2} \bar{y}_s + \frac{\sigma^2}{\sigma^2 + n \delta^2} \varphi - a_1 \sigma^2 \right) = \bar{y}_s - a_1 \sigma^2 \qquad (34)$$

Hence, for large sample size $n$, the Bayes estimate of $\theta_p$ shrinks to $\bar{y}_s - a_1 \sigma^2$. That is, as the sample size $n$ increases, the influence of the prior distribution on posterior inferences decreases. This idea is sometimes referred to as asymptotic theory, because it refers to properties that hold in the limit as $n$ becomes large. The large-sample results are not actually necessary for performing Bayesian data analysis, but they are often useful as approximations and as tools for understanding; see Gelman et al. (2009).






(c) It is easy to verify that for given $(a_1, \sigma^2, \delta^2, \varphi)$, the generalized MLE of $\theta_p \in \Theta$ is given by:

$$\hat{\theta}_{pg}(\bar{y}_s) = \lambda \varphi + (1 - \lambda) \bar{y}_s - a_1 \sigma^2 \qquad (35)$$

which is the sample Bayes estimate of $\theta_p$.

(d) Furthermore, the $100(1-\alpha)\%$ highest sample posterior density (HSPD) credible set for $\theta_p$ is given by:

$$C = \left\{ \lambda \varphi + (1 - \lambda) \bar{y}_s - a_1 \sigma^2 \pm z_{\alpha/2} \sqrt{\frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2}} \right\} \qquad (36)$$

where $z_{\alpha/2}$ is the $100(1 - \alpha/2)\%$ percentile of the standard normal distribution. Note that if $a_1 = 0$ (that is, the sampling design is noninformative), then the $100(1-\alpha)\%$ highest sample posterior density (HSPD) credible set for $\theta_p$ is given by:

$$C = \left\{ \lambda \varphi + (1 - \lambda) \bar{y}_s \pm z_{\alpha/2} \sqrt{\frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2}} \right\} \qquad (37)$$

which is the classical $100(1-\alpha)\%$ highest posterior density (HPD) credible set for $\theta_p$.

Note that $\{q(\theta_s);\ \theta_s \in \Theta\}$ is a conjugate family for $\{f_s(y_i \mid \theta_s);\ \theta_s \in \Theta\}$, because $q_s(\theta_s \mid y_i)$ is in the class of $\{q(\theta_s);\ \theta_s \in \Theta\}$.

Case 2: Assume that the prior distribution for $\theta_s = \theta_p + a_1 \sigma^2$ is $\theta_s \sim N(\varphi + a_1 \sigma^2, \delta^2)$ with known $(\varphi, \delta^2, a_1)$. Under this case we have:

$$\theta_s \mid \bar{y}_s \sim N\!\left( \frac{\sigma^2 (\varphi + a_1 \sigma^2) + n \delta^2 \bar{y}_s}{\sigma^2 + n \delta^2},\ \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} \right) \qquad (38)$$

$$E_s(\theta_s \mid \bar{y}_s) = \lambda \varphi + (1 - \lambda) \bar{y}_s + \lambda a_1 \sigma^2 = E_p(\theta_p \mid \bar{y}_s) + \lambda a_1 \sigma^2 \qquad (39)$$

$$V_s(\theta_s \mid \bar{y}_s) = \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} = V_p(\theta_p \mid \bar{y}_s) \qquad (40)$$

$$E_s(\theta_p \mid \bar{y}_s) = E_s(\theta_s \mid \bar{y}_s) - a_1 \sigma^2 = \lambda \varphi + (1 - \lambda)(\bar{y}_s - a_1 \sigma^2) \qquad (41)$$






For given $(a_1, \sigma^2, \delta^2, \varphi)$, the Bayes estimator of $\theta_s \in \Theta$ under the sample posterior pdf is:

$$\hat{\theta}_{sB} = E_s(\theta_s \mid \bar{y}_s) = \lambda \varphi + (1 - \lambda) \bar{y}_s + \lambda a_1 \sigma^2 \qquad (42)$$

where

$$\lambda = \frac{\sigma^2}{\sigma^2 + n \delta^2}$$

Hence, the Bayes estimator of $\theta_p \in \Theta$ is:

$$\hat{\theta}_{pB} = \hat{\theta}_{sB} - a_1 \sigma^2 = \lambda \varphi + (1 - \lambda)(\bar{y}_s - a_1 \sigma^2) \qquad (43)$$

It is easy to verify that for given $(a_1, \sigma^2, \delta^2, \varphi)$, the generalized MLE of $\theta_s \in \Theta$ is given by:

$$\hat{\theta}_{sg} = \lambda \varphi + (1 - \lambda) \bar{y}_s + \lambda a_1 \sigma^2 \qquad (44)$$

so that:

$$\hat{\theta}_{pg} = \hat{\theta}_{sg} - a_1 \sigma^2 = \lambda \varphi + (1 - \lambda)(\bar{y}_s - a_1 \sigma^2) \qquad (45)$$

The $100(1-\alpha)\%$ highest sample posterior density (HSPD) credible set for $\theta_s \in \Theta$ is given by:

$$C_s = \left\{ \lambda \varphi + (1 - \lambda) \bar{y}_s + \lambda a_1 \sigma^2 \pm z_{\alpha/2} \sqrt{\frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2}} \right\} \qquad (46)$$

so that for $\theta_p = \theta_s - a_1 \sigma^2 \in \Theta$ it is:

$$C = \left\{ \lambda \varphi + (1 - \lambda)(\bar{y}_s - a_1 \sigma^2) \pm z_{\alpha/2} \sqrt{\frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2}} \right\} \qquad (47)$$

Note that $\{q(\theta_s);\ \theta_s \in \Theta\}$ is a conjugate family for $\{f_s(y_i \mid \theta_s);\ \theta_s \in \Theta\}$, because $q_s(\theta_s \mid y_i)$ is in the class of $\{q(\theta_s);\ \theta_s \in \Theta\}$.



Case 3: Assume that the prior distribution for $\theta_p$ is $\theta_p \sim N(\varphi, \delta^2)$ with known $(\varphi, \delta^2)$. Similarly to Cases 1 and 2, we have:

$$\theta_p \mid \bar{y}_s \sim N\!\left( \frac{\sigma^2 \varphi + n \delta^2 \bar{y}_s - n a_1 \sigma^2 \delta^2}{\sigma^2 + n \delta^2},\ \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} \right) \qquad (48)$$

so that

$$E_s(\theta_p \mid \bar{y}_s) = \frac{\sigma^2 \varphi + n \delta^2 \bar{y}_s - n a_1 \sigma^2 \delta^2}{\sigma^2 + n \delta^2} = \frac{\sigma^2 \varphi + n \delta^2 \bar{y}_s}{\sigma^2 + n \delta^2} - n a_1 \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} = E_p(\theta_p \mid \bar{y}_s) - n a_1 V_p(\theta_p \mid \bar{y}_s) \qquad (49)$$

and

$$V_s(\theta_p \mid \bar{y}_s) = \frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2} = V_p(\theta_p \mid \bar{y}_s) \qquad (50)$$

which is similar to Case 2(a).

(b) For given $(a_1, \sigma^2, \delta^2, \varphi)$, the Bayes estimator of $\theta_p \in \Theta$ under the sample posterior pdf is:

$$\hat{\theta}_{pB} = E_s(\theta_p \mid \bar{y}_s) = \lambda \varphi + (1 - \lambda) \bar{y}_s - (1 - \lambda) a_1 \sigma^2 = \lambda \varphi + (1 - \lambda)(\bar{y}_s - a_1 \sigma^2) \qquad (51)$$

where

$$\lambda = \frac{\sigma^2}{\sigma^2 + n \delta^2}$$

which is similar to Case 2(b). Note that here $\hat{\theta}_{pB}$ is a weighted average of $\varphi$ (the prior mean) and $\bar{y}_s - a_1 \sigma^2$ (the maximum likelihood estimator of $\theta_p$), with weights $\lambda$ and $1 - \lambda$, respectively.

(c) It is easy to verify that for given $(a_1, \sigma^2, \delta^2, \varphi)$, the generalized MLE of $\theta_p \in \Theta$ is given by:

$$\hat{\theta}_{pg} = \lambda \varphi + (1 - \lambda)(\bar{y}_s - a_1 \sigma^2) \qquad (52)$$

which is the sample Bayes estimate of $\theta_p \in \Theta$, similar to Case 2(c).

(d) The $100(1-\alpha)\%$ highest sample posterior density (HSPD) credible set for $\theta_p$ is given by:

$$C = \left\{ \lambda \varphi + (1 - \lambda)(\bar{y}_s - a_1 \sigma^2) \pm z_{\alpha/2} \sqrt{\frac{\sigma^2 \delta^2}{\sigma^2 + n \delta^2}} \right\} \qquad (53)$$

which is similar to Case 2(d).
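The agreement between Cases 2 and 3 can be checked numerically; the inputs below are illustrative assumptions:

```python
# Illustrative inputs.
a1, sigma2, delta2, phi = 0.4, 1.0, 2.0, 0.0
n, ybar = 50, 5.4
lam = sigma2 / (sigma2 + n * delta2)

# Case 2: prior on theta_s is N(phi + a1*sigma2, delta2); posterior mean of
# theta_s as in eq. (38), then shift back by -a1*sigma2 as in eq. (43).
mean_s = (sigma2 * (phi + a1 * sigma2) + n * delta2 * ybar) / (sigma2 + n * delta2)
case2 = mean_s - a1 * sigma2

# Case 3: prior on theta_p is N(phi, delta2); posterior mean of theta_p
# directly, combining phi with the bias-corrected data ybar - a1*sigma2.
case3 = (sigma2 * phi + n * delta2 * (ybar - a1 * sigma2)) / (sigma2 + n * delta2)

assert abs(case2 - case3) < 1e-12
assert abs(case2 - (lam * phi + (1 - lam) * (ybar - a1 * sigma2))) < 1e-12
```

Both routes reduce to the common form $\lambda\varphi + (1-\lambda)(\bar{y}_s - a_1\sigma^2)$ of equations (43) and (51).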


Note that if a1  0 (that sampling design is noninformative), then the 100 1   % highest sample posterior density (HSPD) credible set for  p , is given by   2 2  C     1    y s   z 2   2  n 2   which is the 100 1   % highest posterior density (HPD) credible set for  p .









(54)

Hence, the prior distribution for  p ~ N  , 2 with known  , 2 is equivalent to the









prior distribution  s   p  a1 2 ~ N   a1 2 , 2 with known  , 2 , a1 , in the sense that, they give the same predictor of  p   . Not that

q ;  p

p

  is a conjugate family for

q s  p | yi  is in the class of q p ;  p  .

 f s  yi |  s ;  s  ,

because

It should be noted that the similarity between Case 2 and Case 3 is interpreted as follows: under Case 3, θp ~ N(μ, τ²), and therefore θs = θp + a1σ² ~ N(μ + a1σ², τ²), which is the prior distribution under Case 2. The situation is different under Case 1, where θs = θp + a1σ² ~ N(μ, τ²).

Case 4: Suppose that the prior distribution of θs is noninformative or improper:

q(θs) ∝ 1,  −∞ < θs < ∞   (55)

(a) Since ȳs is a sufficient statistic for θs = θp + a1σ², the conditional sample pdf of θs given (y1, ..., yn), or the sample posterior likelihood, is:

qs(θs | ȳs) ∝ q(θs) Ls(θs | ys) ∝ (2πσ²/n)^{−1/2} exp{ −n(ȳs − θs)²/(2σ²) }   (56)

Hence the posterior distribution of θs given ȳs is:

θs | ȳs ~ N(ȳs, σ²/n)   (57)

so that

Es(θs | ȳs) = ȳs   (58)

and

Vs(θs | ȳs) = σ²/n   (59)

We are interested in the sample posterior distribution of θp given ȳs. Since θp = θs − a1σ², the sample posterior distribution of θp given ȳs is:

θp | ȳs ~ N(ȳs − a1σ², σ²/n)   (60)

so that

Es(θp | ȳs) = ȳs − a1σ² = Ep(θp | ȳs) − a1σ²   (61)

and

Vs(θp | ȳs) = σ²/n = Vp(θp | ȳs)   (62)

Note that the sample and population posterior pdfs of θp given ȳs belong to the same family of normal distributions, but the mean under the sample posterior pdf differs from the mean under the population posterior pdf. Furthermore, if a1 = 0, that is, the sampling design is ignorable, then the sample and population pdfs coincide.





(b) For given a1 , 2 , the Bayes estimator of  p   under the sample posterior pdf is (63) ˆ pB  E s  p y s  y s  a1 2





which is the same as the maximum likelihood estimator of  p   under the sample





model: yi |  s   s ~ N  s ,  2 where  p   s  a1 2 .





(c) It is easy to verify that for given a1 , 2 , the generalized MLE of  p   is given by: (64) ˆsg  ys  a1 2 (d) The 100 1   % highest sample posterior density (HSPD) credible set for  p , is given by:   2  C    ys  a1 2   z 2 (65)   n   Note that if a1  0 (that sampling design is noninformative), then the 100 1   % highest sample posterior density (HSPD) credible set for  p , is given by:   2  C    ys   z 2 (66)   n   which is the 100 1   % highest posterior density (HPD) credible set for  . Also it is the 100 1   % confidence interval for  p . 342

Note that {q(θs; ηs)} is a conjugate family for {fs(yi | θs; ηs)}, because qs(θs | yi) is in the class of {q(θs; ηs)}.

6. Application 2 – Bernoulli Superpopulation Model

Let yi | θp ~ Ber(θp), i = 1, ..., N, be independent random variables, where θp ∈ (0, 1). Suppose that the sample inclusion probabilities have expectations:

Ep(πi | yi) = Pr(i ∈ s | yi) = exp(a0 + a1 yi),  a0 ≤ 0, a0 + a1 ≤ 0
            = ν0 = exp(a0)       if yi = 0
            = ν1 = exp(a0 + a1)  if yi = 1   (67)

so that a0 = ln ν0 and a1 = ln(ν1/ν0). In this case ν0 and ν1 are known before sampling. We can show that the sample distribution of yi is:

fs(yi | θs) = Pr(i ∈ s | yi) fp(yi | θp) / Pr(i ∈ s)
            = [θp ν1]^{yi} [(1 − θp) ν0]^{1−yi} / [θp ν1 + (1 − θp) ν0],  yi = 0, 1   (68)

Thus the distribution of yi, i ∈ s, is the same as the distribution of yi, i ∈ U (Bernoulli), except that the population parameter θp changes to:

θs = θs(θp, ν1, ν0) = θp ν1 / [θp ν1 + (1 − θp) ν0],   (69)

which is not a linear function of θp.
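The mapping (69) from θp to θs can be checked by simulation; the values of θp, ν0, ν1 below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(7)
theta_p, nu0, nu1 = 0.3, 0.5, 0.9      # inclusion prob 0.5 if y=0, 0.9 if y=1
N = 200_000

y = rng.random(N) < theta_p            # Bernoulli(theta_p) finite population
pi = np.where(y, nu1, nu0)             # informative inclusion probabilities
in_sample = rng.random(N) < pi         # Poisson sampling

# Equation (69): the sampled y's are Ber(theta_s), not Ber(theta_p)
theta_s = theta_p * nu1 / (theta_p * nu1 + (1 - theta_p) * nu0)
print(theta_s, y[in_sample].mean())
```

Here θs = 0.27/0.62 ≈ 0.435, well above θp = 0.3, because successes are oversampled (ν1 > ν0).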

Notice that if a1  0 , or equivalently 0  1 , that is, the sampling design is ignorable then the sample and population distributions are the same. Case 1: Now suppose that the prior distribution of  p is  p ~ B ,   , where  ,   are known. The conditional sample pdf of  p y s or the sample posterior likelihood is given by:





q s  p | y s   q  p Ls y s  p ,  p  

where

q  L y p

s

s

p

  q  f y n

p

s

i

(70)

|p 

i 1

  p 1 1   p  1 1   p  1  Beta ,     p 1  1   p  0 Pak.j.stat.oper.res. Vol.XIII No.2 2017 pp327-353

   

ny s

 1   p  0      1    p 0  p 1

   

n  ny s i

(71)

343

which cannot be written in the form of a beta distribution. Note that {q(θp; ηp)} is not a conjugate family for {fs(yi | θp; ηp)}, because qs(θp | yi) is not in the class of {q(θp; ηp)}. Recall that θs = θp ν1 / [θp ν1 + (1 − θp) ν0] is not a linear function of θp.

Case 2: To make life easier, assume that the prior distribution of θs is θs ~ Beta(α, β). It is easy to verify that the conditional sample pdf of θs given ys, or the sample posterior likelihood, is given by:

qs(θs | ys) ∝ q(θs) Ls(ys | θs, ηs)   (72)

where

q(θs) Ls(ys | θs) = q(θs) ∏_{i=1}^n fs(yi | θs) = θs^{α+nȳs−1} (1 − θs)^{β+n−nȳs−1} / Beta(α, β)   (73)

Hence,

qs(θs | ys) = θs^{α+nȳs−1} (1 − θs)^{β+n−nȳs−1} / Beta(α + nȳs, β + n − nȳs)   (74)

After some algebra, we can show that the sample posterior distribution of θs given ys is:

θs | ys ~ Beta(α + nȳs, β + n − nȳs)   (75)

so that

Es(θs | ys) = (α + nȳs) / (α + β + n)   (76)

and

Vs(θs | ys) = (α + nȳs)(β + n − nȳs) / [(α + β + n)² (α + β + n + 1)]   (77)

Note that {q(θs; ηs)} is a conjugate family for {fs(yi | θs; ηs)}, because qs(θs | yi) is in the class of {q(θs; ηs)}.

Hence, for given (α, β, ν0, ν1), the Bayes estimate of θs = θ under the sample posterior pdf is given by:

θ̂sB = Es(θs | ys) = (α + nȳs) / (α + β + n)   (78)


But we are interested in the Bayes estimate of θp = θ. Using the relationship between θs and θp, we get:

θ̂pB = θ̂sB ν0 / [θ̂sB ν0 + (1 − θ̂sB) ν1] = ν0(α + nȳs) / [ν0(α + nȳs) + ν1(β + n − nȳs)]   (79)

Notice that if ν0 = ν1, that is, the sampling design is ignorable, then

θ̂pB(srs) = (α + nȳs) / (α + β + n),   (80)

which is the classical Bayes estimate of θp under simple random sampling.

The Bayes estimate θ̂pB can be written as:

θ̂pB = [n ν0 / (ν0(α + nȳs) + ν1(β + n − nȳs))] ȳs + [(α + β) ν0 / (ν0(α + nȳs) + ν1(β + n − nȳs))] · α/(α + β)   (81)
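Expressions (78)-(79) translate directly into code: update the Beta prior on θs conjugately, then map the posterior mean back to the θp scale. The numerical inputs are hypothetical:

```python
def bayes_theta_p(n, ybar_s, alpha, beta, nu0, nu1):
    """Bayes estimates under informative Bernoulli sampling.

    theta_s_hat: posterior mean of theta_s, equation (78).
    theta_p_hat: back-transformed estimate of theta_p, equation (79).
    """
    theta_s_hat = (alpha + n * ybar_s) / (alpha + beta + n)
    theta_p_hat = (theta_s_hat * nu0
                   / (theta_s_hat * nu0 + (1 - theta_s_hat) * nu1))
    return theta_s_hat, theta_p_hat

ts, tp = bayes_theta_p(n=20, ybar_s=0.4, alpha=2, beta=2, nu0=0.5, nu1=0.9)
# The closed form (79) agrees:
num = 0.5 * (2 + 20 * 0.4)
den = num + 0.9 * (2 + 20 - 20 * 0.4)
print(ts, tp, num / den)
```

The two-step route (update on the θs scale, then transform) is exactly what makes the sample-parameter prior "easier" than a prior on θp, since the θs update is a standard conjugate one.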

which is a linear combination of ȳs, the maximum likelihood estimator of θp under the population model yi | θp ~ Ber(θp), and α/(α + β), the mean of the prior θs ~ Beta(α, β). Furthermore, we have:

lim_{n→∞} θ̂pB = ν0 ȳs / [ν0 ȳs + ν1(1 − ȳs)]   (82)

Hence, for large sample size n, the Bayes estimate of θp shrinks to:

θ̂p(inf) = ν0 ȳs / [ν0 ȳs + ν1(1 − ȳs)],   (83)

where θ̂p(inf) denotes the maximum likelihood estimate of θp under informative sampling, treating θp as a fixed unknown constant. Notice that if ν0 = ν1, that is, the sampling design is ignorable, then lim_{n→∞} θ̂pB = ȳs.
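The shrinkage limit (82)-(83) can be seen numerically: holding ȳs fixed and letting n grow, the Bayes estimate (79) approaches ν0ȳs/(ν0ȳs + ν1(1 − ȳs)). A short sketch with hypothetical values:

```python
alpha, beta, nu0, nu1, ybar = 2.0, 2.0, 0.5, 0.9, 0.4

def theta_p_bayes(n):
    """Bayes estimate of theta_p, equation (79), for a sample of size n."""
    num = nu0 * (alpha + n * ybar)
    return num / (num + nu1 * (beta + n - n * ybar))

limit = nu0 * ybar / (nu0 * ybar + nu1 * (1 - ybar))   # equation (83)
print([round(theta_p_bayes(n), 4) for n in (10, 100, 10_000)], round(limit, 4))
```

The prior's influence (through α and β) visibly fades as n increases, in line with the remark that the prior matters less and less under the sample posterior.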

The important feature of this formula is that, if θp (and consequently θs) is treated as a fixed unknown constant (classical or frequentist statistics), that is, α = β = 0, and the sampling design is informative, then for large sample size n the Bayes estimate of θp becomes ν0ȳs / [ν0ȳs + ν1(1 − ȳs)].

It is easy to verify that for given (α, β), the generalized MLE of θs is given by:

θ̂sg = (α + nȳs)/(α + β + n) = [n/(α + β + n)] ȳs + [(α + β)/(α + β + n)] · α/(α + β),   (84)

which is the sample Bayes estimate of θs = θ. Hence, using the relationship between θs and θp, we have:

θ̂pg = ν0(α + nȳs) / [ν0(α + nȳs) + ν1(β + n − nȳs)]   (85)

Notice that if ν0 = ν1, that is, the sampling design is ignorable, then

θ̂sg = θ̂pg = (α + nȳs)/(α + β + n),   (86)

which is the Bayes estimate of θp = θ under a noninformative sampling design.

The important feature of this formula is that, if θp is treated as a fixed unknown constant (classical or frequentist statistics), that is, α = β = 0, and the sampling design is noninformative, then for large sample size the Bayes estimate of θp becomes ȳs, which is the MLE of θp.

7. Bayes Prediction of Finite Population Total

Sverchkov and Pfeffermann (2004) used the sample and sample-complement distributions for the prediction of finite population totals under informative single-stage sampling designs. Later, Eideh and Nathan (2009) extended the theory to general linear functions of the population values and to two-stage informative cluster sampling. None of the above studies considers prediction of the finite population total from a Bayesian perspective. In this section we use the sample posterior and sample-complement posterior distributions to predict the finite population total under single-stage informative sampling. Let

T = Σ_{i=1}^N yi = Σ_{i∈s} yi + Σ_{i∉s} yi   (87)

The available information for the prediction process is O = { (yi, πi), i ∈ s; I_i, i ∈ U; N, n }, where I_i = 1 for i ∈ s and I_i = 0 for i ∉ s.

Now, following the notation of Sections 2 and 3, the sample-complement posterior distribution of θp given yi, denoted by qc(θp | yi), is defined by:

qc  p | yi  

q p  f c  yi |  p 

(88)

mc  yi 

where, f c  yi |   is the sample complement distribution of yi given  p   p , which is given by: E p 1   i | yi  f p yi |  p  (89) f c yi |  p   Ep 1  i  p



Also, we have: Ec yi |  p  

and



E p 1   i  yi |  p  E p 1   i  |  p 



Es wi  1 yi |  p  Es wi  1 |  p 

 q p  f c  yi |  p d p    p mc  y i     q p  f c  yi |  p    p

(90)

if  p is continuous

(91)

if  p is discrete

  q  E 1   | y  f  y |   q  | y   m  y E 1   |   q  E 1   | y  f  y |    E 1   |  

Hence, qc  p | yi can be written as: p

c

p

p

i

i

c

p

i

p

i

p

p

i

i

p

Let

p

i

i

i

p

i

(92)

p

p

  y p   y1 ,  , y n , y n 1 ,  y N   y s , y c  ,

 y s   y1 ,, y n 

where

and

 y c   yn1 ,, y N  . Let T  T y p  be the finite population quantity to be predicted. For N example: the finite population total T  y . Let Tˆ  y ,, y   Tˆ y  define the



i 1

i

1

n

s

predictor of T y p  based on O . The loss involved in predicting T y p  by Tˆ y s  is L T y p , Tˆ  ys  . For given  p   p and y s , and squared error loss function, the Bayes





predictor of T y p  is: Tˆ y s   E p T y p  y s 

(93)

Now we consider the following:  N    E p T y p  y s   E p  yi y s   E p  yi   yi  y s  ic  i 1     is

(94)

  yi   Ec  yi y s  is

ic

Thus, the estimate of the population total, for a fully Bayesian method under squared error loss, requires the posterior predictive expectation E c  y i y s  for prediction of Pak.j.stat.oper.res. Vol.XIII No.2 2017 pp327-353

347

Abdulhakeem A.H. Eideh

yi , i  c  s given y s . This posterior expectation, under informative sampling design, can be obtained by the iterated expectations: (95) Ec  yi y s   E s Ec yi y s , p y s

 

 

where E s is the expectation with respect to the sample posterior distribution of  p given y s , and Ec is the expectation with respect to the sample complement of the posterior predictive distribution of y i , i  c given y s .
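The weight identity in (90) is what makes the complement expectation estimable from the sample alone: with wi = 1/πi, the sample moment Es[(wi − 1)yi]/Es[(wi − 1)] targets the mean of the nonsampled units. A simulation sketch with hypothetical Bernoulli values:

```python
import numpy as np

rng = np.random.default_rng(1)
theta_p, nu0, nu1 = 0.3, 0.5, 0.9
N = 400_000

y = (rng.random(N) < theta_p).astype(float)     # finite population of 0/1 values
pi = np.where(y == 1, nu1, nu0)                 # known inclusion probabilities
in_s = rng.random(N) < pi                       # Poisson sampling

w = 1.0 / pi[in_s]                              # sampling weights of sampled units
est = np.sum((w - 1) * y[in_s]) / np.sum(w - 1) # right-hand side of (90)
true_c = y[~in_s].mean()                        # mean of the nonsampled units
print(est, true_c)
```

Both quantities land near (1 − ν1)θp / [(1 − ν1)θp + (1 − ν0)(1 − θp)] ≈ 0.079, the complement mean implied by this design: units with y = 1 are mostly sampled out, so the complement is dominated by zeros.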

Illustration 1 (Case 4 in Section 5). Superpopulation model: let yi | θp ~ N(θp, σ²), i = 1, ..., N, be independent random variables, where σ² > 0 is known. Suppose that Ep(πi | yi) = exp(a1 yi). Then the sample model is yi | θs ~ N(θs, σ²), where θs = θp + a1σ². Suppose that the prior distribution of θs is q(θs) ∝ 1, −∞ < θs < ∞. Then θp | ȳs ~ N(ȳs − a1σ², σ²/n).

Computation of Ec(yi | ȳs) = Es[Ec(yi | ȳs, θp) | ȳs]: since yi, i ∈ c, is independent of ys, we have

Ec(yi | ȳs, θp) = Ec(yi | θp) = Ep[(1 − πi) yi | θp] / Ep[(1 − πi) | θp]   (96)

After some algebra, and noting that Ep(πi yi | θp) = Ep[Ep(πi | yi) yi | θp], we can show that:

Ec(yi | ȳs, θp) = θp − a1σ² Mp(a1) / (1 − Mp(a1))   (97)

where Mp(a1) = exp(a1θp + 0.5a1²σ²) is the moment generating function of yi | θp ~ N(θp, σ²), so that

Ec(yi | ȳs) = Es[Ec(yi | ȳs, θp) | ȳs] = Es(θp | ȳs) − a1σ² Es[ Mp(a1)/(1 − Mp(a1)) | ȳs ]   (98)

Now,

Es(θp | ȳs) = ȳs − a1σ²   (99)

and

Es[ Mp(a1)/(1 − Mp(a1)) | ȳs ] = ∫ [ exp(a1θp + 0.5a1²σ²) / (1 − exp(a1θp + 0.5a1²σ²)) ] √(n/(2πσ²)) exp{ −n(θp − (ȳs − a1σ²))² / (2σ²) } dθp   (100)

Using a Taylor series expansion of Mp(a1)/(1 − Mp(a1)) around θp = ȳs − a1σ², we can show that, approximately:

Es[ Mp(a1)/(1 − Mp(a1)) | ȳs ] ≈ exp(a1(ȳs − a1σ²) + 0.5a1²σ²) / (1 − exp(a1(ȳs − a1σ²) + 0.5a1²σ²)) = exp(a1ȳs − 0.5a1²σ²) / (1 − exp(a1ȳs − 0.5a1²σ²))   (101)

Hence,

Ec(yi | ȳs) = ȳs − a1σ² [ 1 + exp(a1ȳs − 0.5a1²σ²) / (1 − exp(a1ȳs − 0.5a1²σ²)) ]   (102)

Thus, for given (a1, σ²), the Bayes predictor of T = Σ_{i=1}^N yi is:

T̂(ys) = Ep[T(yp) | ys]
      = Σ_{i∈s} yi + Σ_{i∈c} { ȳs − a1σ² [1 + exp(a1ȳs − 0.5a1²σ²)/(1 − exp(a1ȳs − 0.5a1²σ²))] }
      = nȳs + (N − n)ȳs − (N − n)a1σ² [1 + exp(a1ȳs − 0.5a1²σ²)/(1 − exp(a1ȳs − 0.5a1²σ²))]
      = Nȳs − (N − n)a1σ² [1 + exp(a1ȳs − 0.5a1²σ²)/(1 − exp(a1ȳs − 0.5a1²σ²))]   (103)

Note that if a1 = 0, that is, the sampling design is noninformative, then the Bayes predictor of T = Σ_{i=1}^N yi is the classical one obtained under the design-based and frequentist model-based schools. Furthermore, the Bayes predictor of T under informative sampling adjusts the classical predictor by the term (N − n)a1σ² [1 + exp(a1ȳs − 0.5a1²σ²)/(1 − exp(a1ȳs − 0.5a1²σ²))], which depends on the informativeness parameter a1. Bayes prediction of the finite population total for the other cases (Cases 1, 2, 3) considered in Section 5 can be treated similarly.

 

Illustration 2: Case 2 in Section 6. Let yi |  p   p ~ Ber  p be independent random variables, where  p  0,1 , i  1,..., N . Suppose that the sample inclusion probabilities have expectations: if y i  0  exp a0    0 (104) E p  i | y i    exp a 0  a1   1 if y i  1 Recall that the sample distribution of yi is

f s  yi  s    s  i 1   s  y

1 yi

, yi  0,1

Pak.j.stat.oper.res. Vol.XIII No.2 2017 pp327-353

(105) 349

Abdulhakeem A.H. Eideh

where

s 

 p 1

 p 1  1   p  0

(106)

As in Illustration 1, the Bayes predictor of T = Σ_{i=1}^N yi depends on the computation of Ec(yi | ys) = Es[Ec(yi | ys, θp) | ys]. After some algebra, we can show that:

Ec(yi | ys, θp) = (1 − ν1)θp / [(1 − ν1)θp + (1 − ν0)(1 − θp)]   (107)

so that

Ec(yi | ys) = Es{ (1 − ν1)θp / [(1 − ν1)θp + (1 − ν0)(1 − θp)] | ys }   (108)

We know that, approximately,

Es(θp | ys) ≈ θ̃p = ν0(α + nȳs) / [ν0(α + nȳs) + ν1(β + n − nȳs)]   (109)

Expanding (1 − ν1)θp / [(1 − ν1)θp + (1 − ν0)(1 − θp)] around θp = θ̃p, we can show that, approximately:

Es{ (1 − ν1)θp / [(1 − ν1)θp + (1 − ν0)(1 − θp)] | ys } ≈ (1 − ν1)θ̃p / [(1 − ν1)θ̃p + (1 − ν0)(1 − θ̃p)]   (110)

Hence, approximately:

Ec(yi | ys) ≈ (1 − ν1)θ̃p / [(1 − ν1)θ̃p + (1 − ν0)(1 − θ̃p)]   (111)

Thus, for given (α, β, ν0, ν1), the Bayes predictor of T = Σ_{i=1}^N yi is:

T̂(ys) = Ep[T(yp) | ys] = Σ_{i∈s} yi + Σ_{i∈c} Ec(yi | ys) = nȳs + (N − n)(1 − ν1)θ̃p / [(1 − ν1)θ̃p + (1 − ν0)(1 − θ̃p)]   (112)

Note that if ν1 = ν0, that is, the sampling design is noninformative, then the Bayes predictor of T = Σ_{i=1}^N yi becomes:

T̂(ys) = Ep[T(yp) | ys] = nȳs + (N − n)(α + nȳs)/(α + β + n),   (113)

which is the classical Bayes predictor obtained under the model-based school.
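The predictor (112) and its noninformative special case (113) in code, with hypothetical inputs throughout:

```python
def bayes_total_bernoulli(n, ybar, N, alpha, beta, nu0, nu1):
    """Bayes predictor (112) of T under informative Bernoulli sampling."""
    theta_p_hat = (nu0 * (alpha + n * ybar)
                   / (nu0 * (alpha + n * ybar) + nu1 * (beta + n - n * ybar)))
    ec = ((1 - nu1) * theta_p_hat                  # complement mean, equation (111)
          / ((1 - nu1) * theta_p_hat + (1 - nu0) * (1 - theta_p_hat)))
    return n * ybar + (N - n) * ec

# Noninformative case nu0 = nu1 reduces to (113):
t_inf = bayes_total_bernoulli(n=50, ybar=0.4, N=1000, alpha=2, beta=2,
                              nu0=0.6, nu1=0.6)
t_113 = 50 * 0.4 + (1000 - 50) * (2 + 50 * 0.4) / (2 + 2 + 50)
print(t_inf, t_113)
```

When ν0 = ν1, the (1 − ν) factors cancel in the complement mean, so the prediction collapses to the classical beta-binomial predictor, exactly as (113) states.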

9. Conclusion

In this paper we introduced the posterior sample and complement-sample distributions under informative sampling and used them in the analysis of survey data, developing Bayes predictors of finite population totals and estimating the parameters that characterize the superpopulation normal and Bernoulli models. Several familiar estimators and predictors in common use are shown to be special cases of the present theory, thus providing them with a new interpretation. We have also shown that many simple Bayesian analyses based on a noninformative prior under the sample distribution give results similar to standard non-Bayesian approaches under informative sampling. Furthermore, for large sample size n, the Bayes estimate of θp shrinks, for example, to ȳs − a1σ², which is the MLE (or Bayes estimator) under the sample distribution. That is, as the sample size n increases, the influence of the prior distribution on sample posterior inferences under informative sampling decreases.

The main features of the present estimators and predictors are their behaviour in terms of the informativeness parameters. The use of Bayes estimators and predictors that ignore an informative sampling design yields biased Bayes estimators and predictors. That is, ignoring the sampling design by not using the weights in Bayesian statistical inference from survey data can seriously bias the results of the analysis, leading to erroneous conclusions, so the survey design should be incorporated when making analytic inferences from survey data. Finally, for mathematical convenience, choosing a prior distribution for θs, the parameter that characterizes the sample distribution, and then deriving the Bayes estimator of θp, the parameter that characterizes the superpopulation model, via the relationship between θp and θs, is easier than choosing a prior distribution for θp directly.
I hope that the present theory and the approach posed in this paper will encourage survey statisticians to investigate further this vital topic of Bayesian statistical inference from sample survey data.

Acknowledgment
The author is grateful to Professor Ray Chambers for his comments and interest.


References

1. Chambers, R. and Skinner, C. (2003). Analysis of Survey Data. New York: Wiley.
2. Eideh, A. H. (2003). Estimation for Longitudinal Survey Data under Informative Sampling. PhD Thesis, Department of Statistics, Hebrew University of Jerusalem, Israel.
3. Eideh, A. H. and Nathan, G. (2006a). The analysis of data from sample surveys under informative sampling. Acta et Commentationes Universitatis Tartuensis de Mathematica, 10, 41-51.
4. Eideh, A. H. and Nathan, G. (2006b). Fitting time series models for longitudinal survey data under informative sampling. Journal of Statistical Planning and Inference, 136, 3052-3069.
5. Eideh, A. H. and Nathan, G. (2009). Two-stage informative cluster sampling with application in small area estimation. Journal of Statistical Planning and Inference, 139, 3088-3101.
6. Eideh, A. H. (2012a). Fitting variance components model and fixed effects model for one-way analysis of variance to complex survey data. Communications in Statistics - Theory and Methods, 41, 3278-3300.
7. Eideh, A. H. (2012b). Estimation and prediction under nonignorable nonresponse via response and nonresponse distributions. Journal of the Indian Society of Agricultural Statistics, 66(3), 359-380.
8. Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (2009). Bayesian Data Analysis (second edition). Chapman & Hall/CRC, New York.
9. Kim, D. H. (2002). Bayesian and empirical Bayesian analysis under informative sampling. Sankhya, Series B, 64(3), 267-288.
10. Little, R. J. (2012). Calibrated Bayes, an alternative inferential paradigm for official statistics. Journal of Official Statistics, 28(3), 309-334.
11. Nandram, B., Choi, J. W., Shen, G., and Burgos, C. (2006). Bayesian predictive inference under informative sampling and transformation. Applied Stochastic Models in Business and Industry, 22, 559-572.
12. Pfeffermann, D. and Sverchkov, M. (1999). Parametric and semi-parametric estimation of regression models fitted to survey data. Sankhya, Series B, 61, 166-186.
13. Pfeffermann, D. and Sverchkov, M. (2003). Fitting generalized linear models under informative probability sampling. In: Analysis of Survey Data (Eds. R. Chambers and C. J. Skinner). New York: Wiley, pp. 175-195.
14. Pfeffermann, D., Krieger, A. M., and Rinott, Y. (1998). Parametric distributions of complex survey data under informative probability sampling. Statistica Sinica, 8, 1087-1114.
15. Pfeffermann, D. (2011). Modelling of complex survey data: why is it a problem? How should we approach it? Survey Methodology, 37(2), 115-136.
16. Rao, J. N. K. (2011). Impact of frequentist and Bayesian methods on survey sampling practice: a selective appraisal. Statistical Science, 26, 240-256.
17. Savitsky, T. D. and Toth, D. (2016). Bayesian estimation under informative sampling. Electronic Journal of Statistics, 10, 1677-1708.
18. Skinner, C. J., Holt, D., and Smith, T. M. F. (Eds.) (1989). Analysis of Complex Surveys. New York: Wiley.
19. Sverchkov, M. and Pfeffermann, D. (2004). Prediction of finite population totals based on the sample distribution. Survey Methodology, 30, 79-92.