A Bayesian Approach to Multivariate H-spline Nonparametric Regression

Ronaldo Dias, UNICAMP - Universidade de Campinas
Dani Gamerman, UFRJ - Universidade Federal do Rio de Janeiro

Abstract
1 Introduction
Several methods have been suggested to estimate an unknown regression curve f non-parametrically using splines since the pioneering work of Craven and Wahba (1979). Kimeldorf and Wahba (1970) and Wahba (1983) gave an attractive Bayesian interpretation for an estimate fˆ of the unknown curve f. They showed that fˆ can be viewed as a Bayes estimate of f with respect to a certain prior on the class of all smooth functions. The Bayesian approach allows one not only to estimate the unknown function, but also to provide error bounds by constructing the corresponding Bayesian confidence intervals (Wahba 1983). In addition, Wahba noted that the coverage probabilities do not hold at each individual point, but rather are valid when averaged across the entire curve. Moreover, the true coverage can fall far short of the nominal level at points of unusually large local curvature. Nychka (1988) pointed out that the bias increases in those regions, and that this increase is due to the global value of the smoothing parameter: it is appropriate on average across all points, but does not adapt to the local behavior of the function in regions of high curvature, where a polynomial spline with a global smoothing parameter tends to over-smooth.

Traditionally, two techniques have been used to address the problem of spatial adaptability. One uses local variable smoothing parameters (or bandwidths) in common smoothing methods, e.g., smoothing spline and kernel methods. In smoothing spline techniques, the number of basis functions is generally set as large as the number of observations, and a smoothing parameter is chosen to control the trade-off between adaptability to the data and the smoothness enforced by a penalty term. The other technique places knots adaptively in a regression spline method, which is equivalent to adaptively choosing a set of spline basis functions for the regression.
Abramovich and Steinberg (1996), Luo and Wahba (1997) and Dias (1999) proposed more adaptive methods to obtain good estimates of regression curves. The h-splines method, introduced by Dias (1998) for non-parametric density estimation and by Luo and Wahba (1997) and Dias (1999) in the context of non-parametric regression, combines ideas from regression splines and smoothing spline methods by finding the number of basis functions and the smoothing parameter iteratively. One advantage of the h-splines method is its ability to vary the amount of smoothing in response to the inhomogeneous curvature of the true function at different locations.
In this paper we extend the work of Dias and Gamerman (2002) by proposing a multivariate set of smoothing parameters that varies according to prior information.
2 Bayesian Approach
Consider the following regression model:
$$ y = f(x) + \epsilon, \qquad (1) $$
where the error vector $\epsilon \sim N(0, \sigma^2 I)$, $x = (x_1, \dots, x_p)^t$ and $y = (y_1, \dots, y_n)^t$. Assume that, for $i = 1, \dots, n$,
$$ f(x_{1i}, \dots, x_{pi}) = \sum_{j=1}^{p} f_j(x_{ji}), \qquad (2) $$
such that $f_j(x_{ji}) = \sum_{l=1}^{K_j} \theta_{lj} B_l(x_{ji})$, where the $B_l$'s are the well-known B-spline basis functions. Thus, rewriting $f$ in matrix notation, we have
$$ f_j = X_j \theta_j, \qquad j = 1, \dots, p, \qquad (3) $$
where the entries of the matrix $X_j$ are $(X_j)_{ml} = B_l(x_{jm})$ for $m = 1, \dots, n$ and $l = 1, \dots, K_j$, and $\theta_j = (\theta_{1j}, \dots, \theta_{K_j j})^t$ for $j = 1, \dots, p$. Consequently, the likelihood for the parameter vector $(\theta, \sigma^2, K)$, with $\theta = (\theta_1, \dots, \theta_p)$ and $K = (K_1, \dots, K_p)$, is
$$ L_y(\theta, K, \sigma^2) \propto (\sigma^2)^{-n/2} \exp\Big\{ -\frac{1}{2\sigma^2} \Big\| y - \sum_{j=1}^{p} X_j \theta_j \Big\|^2 \Big\}. \qquad (4) $$
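The design matrices $X_j$ in (3) can be built by evaluating each B-spline basis function at the observed covariate values. As a minimal sketch (the function name, the clamped knot vector, and the cubic default are assumptions for illustration, not choices made in the paper), the Cox-de Boor recursion gives:

```python
import numpy as np

def bspline_design(x, interior_knots, a=0.0, b=1.0, degree=3):
    """n-by-K B-spline design matrix with entries B_l(x_m), as in (3),
    built by the Cox-de Boor recursion on a clamped knot vector."""
    x = np.asarray(x, dtype=float)
    # clamped knot vector: boundary knots repeated degree+1 times
    t = np.concatenate([[a] * (degree + 1),
                        np.sort(np.asarray(interior_knots, dtype=float)),
                        [b] * (degree + 1)])
    # degree-0 bases: indicators of the knot intervals
    B = np.array([(t[l] <= x) & (x < t[l + 1])
                  for l in range(len(t) - 1)], dtype=float).T
    B[x == b, len(t) - degree - 2] = 1.0  # close the last interval at x = b
    # Cox-de Boor recursion, raising the degree one step at a time
    for d in range(1, degree + 1):
        Bn = np.zeros((len(x), len(t) - d - 1))
        for l in range(len(t) - d - 1):
            left = t[l + d] - t[l]
            right = t[l + d + 1] - t[l + 1]
            if left > 0:
                Bn[:, l] += (x - t[l]) / left * B[:, l]
            if right > 0:
                Bn[:, l] += (t[l + d + 1] - x) / right * B[:, l + 1]
        B = Bn
    return B  # shape (n, K) with K = len(interior_knots) + degree + 1
```

The columns of the returned matrix sum to one at every point in $[a, b]$ (partition of unity), a standard check on the basis construction.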
The usual Bayesian approach to a nonparametric regression model is to place a prior distribution on $f$ (see Silverman and Green 1994), such as
$$ p(f) \propto \exp\Big\{ -\frac{\lambda}{2} \int (f'')^2 \Big\}. $$
Since we assume that $f = \sum_{j=1}^{p} f_j = \sum_{j=1}^{p} X_j \theta_j$, one may use, similarly to Dias and Gamerman (2002), the following independent conditional priors:
$$ p(\theta_j \mid K_j, \lambda_j) \propto \exp\Big\{ -\frac{\lambda_j}{2} \theta_j^t \Omega_j \theta_j \Big\}, \qquad (5) $$
where $\Omega_j$ is a matrix with entries $(\Omega_j)_{lm} = \int B_{lj}''(u) B_{mj}''(u)\, du$ for $j = 1, \dots, p$. In general, we can assume independent priors such as
$$ (\theta_j \mid K_j, \lambda_j) \sim N(\mu_j, \lambda_j^{-1} \Omega_j^-), $$
where $\Omega_j^-$ is the generalized inverse of $\Omega_j$, and the joint prior distribution is given by
$$ p(\theta, K, \lambda) = \prod_{j=1}^{p} p(\theta_j \mid K_j, \lambda_j)\, p(\lambda_j \mid K_j)\, p(K_j), $$
with
$$ p(\lambda_j \mid K_j) = \psi(K_j) \exp\{ -\lambda_j \psi(K_j) \}, \qquad (6) $$
where the function $\psi$ is chosen as in Dias and Gamerman (2002). In addition,
$$ p(K_j) = \frac{\exp(-a_j)\, a_j^{K_j} / K_j!}{1 - \exp(-a_j)(1 - q^*)}, \qquad K_j = 1, \dots, K_j^*, $$
where $K_j^*$ is the maximum number of knots allowed for each $j = 1, \dots, p$, and $q^* = \sum_{l=K_j^*+1}^{\infty} a_j^l / l!$. For $\sigma^2$ we take an inverse Gamma distribution, $IG(u, v)$. The complete likelihood can be written as
$$ L_y(\theta_1^{K_1}, \dots, \theta_p^{K_p}, K_1, \dots, K_p, \sigma^2) \propto (\sigma^2)^{-n/2} \exp\Big\{ -\frac{Q(\theta)}{2\sigma^2} \Big\}, \qquad (7) $$
where $Q(\theta) = \big\| y - \sum_{j=1}^{p} X_j \theta_j \big\|^2$.
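The prior $p(K_j)$ above is a Poisson($a_j$) pmf truncated to $\{1, \dots, K_j^*\}$. As a sketch (the function name is hypothetical), the same probabilities can be computed by normalizing the truncated pmf directly, which sidesteps the infinite tail sum $q^*$:

```python
import math

def knot_number_prior(a_j, K_star):
    """p(K_j) for K_j = 1, ..., K_star: a Poisson(a_j) pmf truncated
    to that range and renormalized by direct summation."""
    weights = [math.exp(-a_j) * a_j ** k / math.factorial(k)
               for k in range(1, K_star + 1)]
    total = sum(weights)
    return [w / total for w in weights]
```

The hyperparameter $a_j$ plays the role of a prior guess at the number of basis functions for the $j$-th component; larger $a_j$ shifts prior mass toward richer bases.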
3 Sampling from the Posterior
In this section we present a scheme for sampling from the posterior based on reversible jump MCMC, as in Green (1995). For this, observe that the posterior distribution is given by
$$ \pi(\theta, \lambda, K, \sigma^2) \propto L_y(\theta, K, \sigma^2) \times \prod_{i=1}^{p} p(\theta_i^{K_i} \mid \lambda_i, K_i) \times \prod_{i=1}^{p} p(\lambda_i \mid K_i) \times \prod_{i=1}^{p} p(K_i) \times p(\sigma^2). \qquad (8) $$
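For fixed $K$ and $\lambda$, the full conditional of each $\theta_j$ in (8) is Gaussian by standard normal-linear conjugacy. A minimal sketch of one such within-model Gibbs update, assuming a zero prior mean $\mu_j = 0$ and that the posterior precision $X_j^t X_j / \sigma^2 + \lambda_j \Omega_j$ is positive definite (the reversible jump moves that change $K_j$ are not shown, and the function name is an assumption):

```python
import numpy as np

def draw_theta_j(rng, y_partial, X_j, Omega_j, lam_j, sigma2):
    """One Gibbs draw of theta_j from its Gaussian full conditional
    under the prior (5) with mu_j = 0. y_partial is y minus the fitted
    contributions of the other additive components."""
    P = X_j.T @ X_j / sigma2 + lam_j * Omega_j   # posterior precision
    b = X_j.T @ y_partial / sigma2
    mean = np.linalg.solve(P, b)                 # posterior mean
    L = np.linalg.cholesky(P)                    # P = L L^t
    z = rng.standard_normal(len(b))
    return mean + np.linalg.solve(L.T, z)        # draw with covariance P^{-1}
```

Cycling such draws over $j = 1, \dots, p$, together with standard conjugate updates for $\sigma^2$ and $\lambda_j$ and the reversible jump step for $K_j$, gives one full sweep of the sampler.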
References

Abramovich, F. and Steinberg, D. M. (1996). Improved inference in nonparametric regression using Lk-smoothing splines, J. Statist. Plann. Inference 49(3): 327–341.
Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions, Numerische Mathematik 31: 377–403.
Dias, R. (1998). Density estimation via hybrid splines, Journal of Statistical Computation and Simulation 60: 277–294.
Dias, R. (1999). Sequential adaptive non parametric regression via H-splines, Communications in Statistics: Computations and Simulations 28: 501–515.
Dias, R. and Gamerman, D. (2002). A Bayesian approach to hybrid splines nonparametric regression, Journal of Statistical Computation and Simulation 72(4): 285–297.
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika 82: 711–732.
Kimeldorf, G. S. and Wahba, G. (1970). A correspondence between Bayesian estimation on stochastic processes and smoothing by splines, The Annals of Mathematical Statistics 41: 495–502.
Luo, Z. and Wahba, G. (1997). Hybrid adaptive splines, Journal of the American Statistical Association 92: 107–116.
Nychka, D. (1988). Bayesian confidence intervals for smoothing splines, J. Amer. Statist. Assoc. 83(404): 1134–1143.
Silverman, B. W. and Green, P. J. (1994). Nonparametric Regression and Generalized Linear Models, Chapman and Hall, London.
Wahba, G. (1983). Bayesian "confidence intervals" for the cross-validated smoothing spline, J. Roy. Statist. Soc. Ser. B 45: 133–150.