Robust Regression Estimation in Generalized Linear Models

Nor Aishah Hamzah
Professor, University of Malaya, Kuala Lumpur, Malaysia
Mohammed Nasser
Professor, University of Rajshahi, Rajshahi, Bangladesh

The idea of generalized linear models (GLM), introduced by Nelder and Wedderburn (), is to extend the domain of applicability of the linear model by relaxing the normality assumption. In particular, a GLM can be used to model the relationship between the explanatory variables, X, and a function of the mean, μi, of continuous or discrete responses. More precisely, a GLM assumes that

g(μi) = ηi = ∑_{j=1}^p xij βj,

where β = (β1, β2, …, βp)^T is the p-vector of unknown parameters and g(⋅) is the link function that determines the scale on which linearity is assumed. Models
of this type include logistic and probit regression, Poisson regression, linear regression with known variance, and certain models for lifetime data. Specifically, let Y1, Y2, …, Yn be n independent random variables drawn from the exponential family with density (or probability function)

f(yi; θi, ϕ) = exp{[yi θi − b(θi)]/a(ϕ) + c(yi, ϕ)}    (1)

for some specific functions a(⋅), b(⋅), and c(⋅, ⋅). Here E(Yi) = μi = b′(θi) and var(Yi) = b″(θi) a(ϕ), with the usual notation for derivatives. The most common methods of estimating the unknown parameter β are maximum likelihood estimation (MLE) and quasi-likelihood estimation (QMLE), which are equivalent if g(⋅) is the canonical link, such as the logit function for logistic regression, the log function for Poisson regression, or the identity function for Normal regression. That is, when g(μi) = θi, the MLE and QMLE of β are the solutions of the p-system of equations

∑_{i=1}^n (yi − μi) xij = 0,   j = 1, …, p.    (2)
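These mean–variance identities can be checked by simulation. The following sketch (an illustration added here, not part of the original exposition; it assumes only NumPy) uses the Poisson family, for which b(θ) = exp(θ) and a(ϕ) = 1, so that both E(Y) = b′(θ) and var(Y) = b″(θ)a(ϕ) equal exp(θ):

```python
import numpy as np

# Poisson family: b(theta) = exp(theta) and a(phi) = 1, so the
# exponential-family identities give
#   E(Y)   = b'(theta)  = exp(theta)
#   var(Y) = b''(theta) = exp(theta).
rng = np.random.default_rng(0)
theta = 1.3
y = rng.poisson(lam=np.exp(theta), size=200_000)

print(y.mean())  # close to exp(1.3) ≈ 3.669
print(y.var())   # also close to exp(1.3)
```

Any other member of the family (binomial, Gamma, Gaussian) obeys the same two identities with its own b(⋅) and a(⋅).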
The estimator defined by (2) can be viewed as an M-estimator with score function

ψ(yi; β) = (yi − μi) xi,    (3)
where xi = (xi1, xi2, …, xip)^T. Since the score function (3) is proportional to x and y, the maximum possible influence in both the x and y spaces is unbounded. When y is categorical, the problem of unbounded influence in x remains and, in addition, the possibility of breakdown caused by inliers arises (Albert and Anderson ). The corresponding estimator of β based on (3) is therefore non-robust, and any attempt to improve the estimation of β should limit these influences. Two basic approaches are usually employed to address these problems: (a) diagnostics and (b) robust estimation.
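The unbounded influence of the score (3) is easy to demonstrate numerically. The sketch below (illustrative only; the data, seed, and the simple Newton–Raphson routine are made up for this example, assuming only NumPy) fits a logistic regression by solving the maximum likelihood score equations, then corrupts a single observation into a high-leverage mislabeled point and refits; the slope estimate is dragged markedly toward zero:

```python
import numpy as np

def logit_mle(X, y, iters=25):
    """Newton-Raphson for the logistic-regression maximum likelihood
    score equations X^T (y - mu) = 0, mu being the inverse-logit mean."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        score = X.T @ (y - mu)                           # score (3), summed over i
        info = X.T @ (X * (mu * (1.0 - mu))[:, None])    # Fisher information
        beta = beta + np.linalg.solve(info, score)
    return beta

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
p = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * x)))   # true intercept -1, slope 2
y = rng.binomial(1, p).astype(float)

beta_clean = logit_mle(X, y)

# Corrupt one point: a gross outlier in x whose label contradicts the model.
Xc, yc = X.copy(), y.copy()
Xc[0, 1], yc[0] = 10.0, 0.0
beta_bad = logit_mle(Xc, yc)

print(beta_clean)   # slope near the true value 2
print(beta_bad)     # slope pulled toward zero by a single point
```

Because the corrupted point enters the score through an unbounded x, its pull grows without limit as its leverage increases, which is precisely the motivation for the bounded-influence estimators discussed in the next sections.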
Diagnostic Measures

In most diagnostic approaches, the MLE is computed first and diagnostic tools are subsequently used to identify potentially influential observations. For details on diagnostic measures, readers are referred to the published works of Pregibon (, ), McCullagh and Nelder (), Johnson (), Williams (), Pierce and Schafer (), Thomas and Cook (), and Adimari and Ventura ().
While these techniques have been quite successful in identifying individual influential points, their generalization to jointly influential points cannot guarantee success. The development of robust methods in the early s provided an option that offers automatic protection against anomalous data. A recent trend in diagnostic research is (a) to detect wild observations with classical diagnostic methods after an initial robust fit (Imon and Hadi ) or (b) to use robust methods in any case (Cantoni and Ronchetti ; Serigne and Ronchetti ).
Robust Estimation

Since the score function (3) is subject to the influence of outlying observations, both in X and in y, appropriate robust estimators are the GM-estimators. These include the Mallows type (Pregibon ) and the Schweppe type (Stefanski et al. ; Künsch et al. ). The proposed methods are discussed here. Let

ℓ(θi, yi) = log f(yi; θi, ϕ) = [yi θi − b(θi)]/a(ϕ) + c(yi, ϕ)    (4)

and define the i-th deviance as di = di(θi) = 2{ℓ(θ̃i, yi) − ℓ(θi, yi)}, where θ̃i is the MLE based on observation yi alone, that is, θ̃i = (b′)^{−1}(yi). The deviance di can be interpreted as a measure of the disagreement between the i-th observation and the fitted model. Thus the MLE, which maximizes the likelihood function, also minimizes the deviances, specifically minimizing M(β) = ∑_{i=1}^n di(θi).
In an attempt to robustify the MLE, the first modification, introduced by Pregibon (), replaces the minimization criterion with M(β) = ∑_{i=1}^n ρ(di). The function ρ(⋅) acts as a filter that limits the contribution of extreme observations in determining the fit to the data. Minimizing this criterion amounts to finding the roots of the score equations

∑_{i=1}^n ψ(di) = ∑_{i=1}^n wi si xij = 0,   j = 1, …, p,    (5)

with si = ∂ℓ(θi, yi)/∂ηi and wi (0 ≤ wi ≤ 1) given by w(di) = ∂ρ(di)/∂di. Note that these are simply weighted versions of the maximum likelihood score equations, with data-dependent weights.
Mallows-Type GM Estimate

Based on Huber's loss function, the weight function wi = min{1, (H/di)^{1/2}}, with an adjustable tuning constant H that aims at achieving some specified efficiency, can be used (Pregibon ). By solving (5), one obtains a class of Mallows M-estimates. This type of estimator is resistant to poorly fitted data, but not to extreme observations in the covariate space, which may exert undue influence on the fit.

Schweppe-Type GM Estimate

Extending the results obtained by Krasker and Welsch () and Stefanski et al. (), Künsch et al. () proposed bounded-influence estimators that are also conditionally Fisher-consistent. Subject to a bound b on the measure of sensitivity γψ (γψ ≤ b < ∞), the following modification of the score function was proposed:

ψBI = {y − μ − c(x^T β, b/(x^T B^{−1} x)^{1/2})} wb(|r(y, x, β, B)| (x^T B^{−1} x)^{1/2}) x^T,

where c(⋅, ⋅) and B are, respectively, a bias-correction term and a dispersion matrix chosen so that the estimates are conditionally Fisher-consistent with bounded influence, and the weight function wb(a) = min{1, b/a} is based on Huber's loss function. As in other Schweppe-type GM estimates, wb(⋅) downweights observations with a high product of corrected residual and leverage. Details on the terms used here can be found elsewhere (see, e.g., Huber () on infinitesimal sensitivity).

Besides this general approach to robust estimation in GLM, several researchers have put forward other estimators for specific cases of GLM. For example, when y follows a Gamma distribution with log link function, Bianco et al. () considered redescending M-estimators and showed that the estimators are Fisher-consistent without any correction term. In the logistic model, Carroll and Pederson () proposed a weighted MLE to robustify estimation, Bianco and Yohai () extended the work of Morgenthaler () and Pregibon () on M-estimators, and Croux and Haesbroeck () developed a fast algorithm to compute the Bianco–Yohai estimator. Gervini () presented robust adaptive estimators, and recently Hobza et al. () opened a new line by proposing robust median estimators in logistic regression (see also Hamzah ). The robust Poisson regression model (RPR) (see Poisson Regression) was proposed by Tsou () for inference about regression parameters for more general count data; there one need not worry about the correctness of the Poisson assumption.
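A minimal implementation of the Mallows-type fit shows how the deviance-based weights act. The sketch below (an illustration with made-up data; it is not Pregibon's original algorithm, just a simple fixed-point iteration for the weighted score equations (5), assuming only NumPy) downweights poorly fitted observations and largely undoes the slope attenuation caused by a mislabeled leverage point:

```python
import numpy as np

def mallows_logit(X, y, H=2.0, iters=100):
    """Fixed-point (Fisher-scoring style) iteration for the weighted score
    equations (5) with deviance-based weights w_i = min{1, (H/d_i)^(1/2)}.
    H = np.inf makes every weight 1 and recovers the ordinary MLE."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-9, 1 - 1e-9)
        # Unit deviances for binary y (the saturated log-likelihood is 0).
        d = np.where(y == 1.0, -2.0 * np.log(mu), -2.0 * np.log(1.0 - mu))
        w = np.minimum(1.0, np.sqrt(H / np.maximum(d, 1e-12)))
        score = X.T @ (w * (y - mu))                          # weighted score (5)
        info = X.T @ (X * (w * mu * (1.0 - mu))[:, None])     # weighted information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
p = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * x)))   # true intercept -1, slope 2
y = rng.binomial(1, p).astype(float)
X[0, 1], y[0] = 10.0, 0.0                      # one mislabeled leverage point

beta_mle = mallows_logit(X, y, H=np.inf)       # ordinary MLE, attenuated slope
beta_rob = mallows_logit(X, y, H=2.0)          # Mallows-type fit
```

The iteration ignores the dependence of the weights on β when forming the update, which keeps the code short; note that, as the text warns, these deviance-based weights do not protect against extreme covariate points that happen to be well fitted.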
Cross References
Generalized Linear Models
Influential Observations
Outliers
Regression Diagnostics
Robust Statistical Methods
Robust Statistics
References and Further Reading
Adimari G, Ventura L () Robust inference for generalized linear models with application to logistic regression. Stat Probab Lett :–
Albert A, Anderson JA () On the existence of maximum likelihood estimates in logistic regression models. Biometrika ():–
Bianco A, Yohai V () Robust estimation in the logistic regression model. In: Rieder H (ed) Robust statistics, data analysis, and computer intensive methods. Lecture notes in statistics, vol . Springer, New York, pp –
Bianco AM, Garcia Ben M, Yohai VJ () Robust estimation for linear regression with asymmetric errors. Can J Stat :–
Cantoni E, Ronchetti E () Robust inference for generalized linear models. J Am Stat Assoc :–
Carroll RJ, Pederson S () On robustness in the logistic regression model. J Roy Stat Soc B :–
Croux C, Haesbroeck G () Implementing the Bianco and Yohai estimator for logistic regression. Comput Stat Data Anal :–
Gervini D () Robust adaptive estimators for binary regression models. J Stat Plan Infer :–
Hamzah NA () Robust regression estimation in generalized linear models. PhD thesis, University of Bristol
Hobza T, Pardo L, Vajda I () Robust median estimator in logistic regression. J Stat Plan Infer :–
Huber PJ () Robust statistics. Wiley, New York
Imon AHMR, Hadi AS () Identification of multiple outliers in logistic regression. Commun Stat Theory Meth ():–
Johnson W () Influence measures for logistic regression: another point of view. Biometrika :–
Krasker WS, Welsch RE () Efficient bounded-influence regression estimation. J Am Stat Assoc :–
Künsch H, Stefanski L, Carroll RJ () Conditionally unbiased bounded influence estimation in general regression models, with applications to generalized linear models. J Am Stat Assoc :–
McCullagh P, Nelder JA () Generalized linear models. Chapman and Hall, London
Morgenthaler S () Least-absolute-deviations fits for generalized linear models. Biometrika :–
Nelder JA, Wedderburn RWM () Generalized linear models. J Roy Stat Soc A :–
Pierce DA, Schafer DW () Residuals in generalized linear models. J Am Stat Assoc :–
Pregibon D () Data analytic methods for generalized linear models. PhD thesis, University of Toronto
Pregibon D () Logistic regression diagnostics. Ann Stat :–
Pregibon D () Resistant fits for some commonly used logistic models with medical applications. Biometrics :–
Serigne NL, Ronchetti E () Robust and accurate inference for generalized linear models. J Multivariate Anal :–
Stefanski L, Carroll RJ, Ruppert D () Optimally bounded score functions for generalized linear models, with applications to logistic regression. Biometrika :–
Thomas W, Cook RD () Assessing influence on predictions from generalized linear models. Technometrics :–
Tsou T-S () Robust Poisson regression. J Stat Plan Infer :–
Williams DA () Generalized linear model diagnostics using the deviance and single case deletions. Appl Stat :–
Robust Statistical Methods

Ricardo Maronna
Professor, University of La Plata and C.I.C.P.B.A., La Plata, Buenos Aires, Argentina
Outliers

The following Table (Hand et al. : ) contains measurements of the speed of light in suitable units (km/s minus ) from the classical experiments performed by Michelson and Morley in .
We may represent our data as

xi = μ + ui,   i = 1, …, n,    (1)
where n = , μ is the true (unknown) speed value, and the ui are random observation errors. We want a point estimate μ̂ and a confidence interval for μ. Figure  is the normal QQ-plot of the data. The three smallest observations clearly stand out from the rest. The central part of the plot is approximately linear, and therefore we may say that the data are "approximately normal." The left-hand half of the following Table shows the sample mean and standard deviation (SD) of the complete data and also of the data without the three smallest observations (the right-hand half will be described below).

                    Mean    SD    Median    MADN
Complete data        .       .       .        .
3 obs. omitted       .       .       .        .
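The contrast summarized in the Table can be reproduced on simulated data (an illustrative sketch with invented numbers, since the Table's actual values are not reproduced above; it assumes only NumPy): pushing three of 100 observations far downward shifts the mean by exactly 3 × 1500/100 = 45 units and inflates the SD, while the median and the normalized MAD barely move:

```python
import numpy as np

def madn(x):
    """Normalized MAD: the median absolute deviation times 1.4826,
    which makes it consistent for the SD under a normal model."""
    return 1.4826 * np.median(np.abs(x - np.median(x)))

rng = np.random.default_rng(1)
clean = rng.normal(loc=850.0, scale=80.0, size=100)   # hypothetical sample
contaminated = clean.copy()
contaminated[:3] -= 1500.0     # three gross low outliers, as in the data

mean_shift = abs(contaminated.mean() - clean.mean())            # exactly 45
sd_shift = abs(contaminated.std(ddof=1) - clean.std(ddof=1))    # large
median_shift = abs(np.median(contaminated) - np.median(clean))  # small
madn_shift = abs(madn(contaminated) - madn(clean))              # small
```

The median and MADN are the robust counterparts of the mean and SD in the right-hand half of the Table: each outlier can move them only by roughly one order-statistic gap, no matter how extreme it is.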