This article was downloaded by: [Hong Kong Polytechnic University] On: 03 July 2012, At: 22:25 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Econometric Reviews Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lecr20
A bayesian approach to dynamic tobit models Steven X. Wei
Department of Finance, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong Phone: (852)23587661 E-mail: Version of record first published: 21 Mar 2007
To cite this article: Steven X. Wei (1999): A bayesian approach to dynamic tobit models, Econometric Reviews, 18:4, 417-439 To link to this article: http://dx.doi.org/10.1080/07474939908800353
ECONOMETRIC REVIEWS, 18(4), 417-439 (1999)
A BAYESIAN APPROACH TO DYNAMIC TOBIT MODELS
Downloaded by [Hong Kong Polytechnic University] at 22:25 03 July 2012
Steven X. Wei Department of Finance The School of Business and Management The Hong Kong University of Science and Technology Clear Water Bay, Kowloon, Hong Kong Email:
[email protected];Tel: (852)23587661
Key Words and Phrases: Bayesian inference; Dynamic Tobit model; The Gibbs sampler with the data augmentation; Monte Carlo simulation; truncated normal distribution
JEL Classification: C11, C24
ABSTRACT This paper develops a posterior simulation method for a dynamic Tobit model. The major obstacle rooted in such a problem lies in high dimensional integrals, induced by dependence among censored observations, in the likelihood function. The primary contribution of this study is to develop a practical and efficient sampling scheme for the conditional posterior distributions of the censored (i.e., unobserved) data, so that the Gibbs sampler with the data augmentation algorithm is successfully applied. The substantial differences between this approach and some existing methods are highlighted. The proposed simulation method is investigated by means of a Monte Carlo study and applied to a regression model of Japanese exports of passenger cars to the U.S. subject to a non-tariff trade barrier.
1 Introduction
Tobit (censored) models, introduced by Tobin (1958), are a class of limited dependent variable models. Since the early 1970's, static Tobit models have been extensively studied and widely used in economics, finance and other fields (for general references, see Amemiya (1984, 1985), Maddala (1987), Greene (1993), Chib (1992) and Geweke (1992)). In the literature, however, dynamic Tobit models have not gained much attention. The reason is that their likelihood functions are often analytically intractable (see the discussion by Poirier and Ruud (1988)).
Copyright © 1999 by Marcel Dekker, Inc.
Nonetheless, dynamic Tobit models are useful in various applications. For example, Zeger and Brookmeyer (1986) used a dynamic Tobit model to analyze air pollution data that are subject to lower limits of detection. In economics and finance, dynamic Tobit models are often appropriate when constraints or regulations are imposed. Peristiani (1994) adopted a simplified version of a dynamic Tobit model to study the behavior of individual bank borrowing at the (Federal Reserve's) discount window. Kodres (1988) proposed a dynamic Tobit model to detect the effect of price limits in currency futures markets. Finally, Zangari and Tsurumi (1996) applied a dynamic Tobit model to data on the Japanese exports of passenger cars to the U.S. subject to a non-tariff trade barrier. The export data used in that study are further explored in this paper. This paper offers a new method for the exact posterior analysis of dynamic Tobit models. Its primary contribution will be to develop a practical and efficient sampling scheme for the conditional posterior distribution of the censored (i.e., unobserved) data, so that the Gibbs sampler with the data augmentation method (cf. Tanner and Wong (1987) and Gelfand and Smith (1990)) can be successfully applied. In particular, we show that the latent data (conditioning on all parameters and observed data), viewed as a vector of special parameters, can be sampled from a group of truncated normal distributions. The means and variances of the normal distributions are analytically obtained and the information embedded in both the dynamic and censored structures of the model is fully taken into account. A decomposition of latent data into strings in terms of their probabilistic dependence over time as suggested by Zeger and Brookmeyer (1986) is used to facilitate our analysis. 
A few major advantages of this Bayesian approach are emphasized: (i) our method handles both stationary and non-stationary dynamic Tobit models in a uniform way; (ii) it can be easily extended to deal with a Student-t version of dynamic Tobit models (with unknown degrees of freedom), though this paper concentrates only on dynamic Tobit models with Gaussian errors; (iii) it is computationally easier than its classical counterpart. While our approach focuses on drawing variates from a truncated normal distribution (see Proposition 3.2), its classical counterpart needs to simulate truncated multivariate normal probabilities and their derivatives with respect to unknown parameters (see Hajivassiliou, McFadden and Ruud (1996) and Lee (1997)). Usually, sampling from a distribution is computationally easier and faster than evaluating a probability from the distribution; (iv) prior information, if any, can be formally incorporated into the process of making a statistical inference. A noninformative (e.g., Jeffreys') prior can be similarly entertained. Linear constraints on the model parameters can also be taken into account easily in the Bayesian paradigm; and (v) the posterior moments of any function of the model parameters can be readily computed. To evaluate the performance of our method, a Monte Carlo study is conducted. We find that even with small sample sizes or non-stationary latent processes, our technique still delivers reasonably good results. In the literature, there exist a few relevant studies of dynamic Tobit models in addition to the ones we have mentioned. Dagenais (1982) and Zeger and Brookmeyer (1986) both explored the applicability of the Maximum Likelihood (ML) method to Tobit models with autocorrelated errors. The ML method had not been attractive until simulation of the multivariate normal probabilities of (multi-dimensional) rectangles (see Hajivassiliou, McFadden and Ruud (1996)) was recently developed. Lee (1997) offered a study on the
estimation of dynamic Tobit models using one of the simulation methods, the Geweke-Hajivassiliou-Keane (GHK) simulator. The maximum pseudo-likelihood method proposed by Zeger and Brookmeyer is tractable but inefficient. The estimation method used by Kodres (1988) is both complex and inefficient (see the discussion by Morgan and Trevor (1997)).

Recently, Zangari and Tsurumi (1996) (hereafter, ZT) considered three Bayesian procedures for Tobit models with AR(1) errors. There exist a number of major differences between our study and theirs. First, the ZT model is nested in the general model presented here. Second, we develop a unifying approach to both stationary and non-stationary dynamic Tobit models, while the ZT approaches are confined to stationary ones, and the Laplace approximation and numerical quadrature they considered are in practice applicable only to some simple cases of a dynamic Tobit model. Finally, ZT failed to run the Gibbs sampler with the data augmentation algorithm on their data, in which there exists a relatively long latent string. ZT thus concluded that the Gibbs sampler with the data augmentation algorithm is not a reliable method for a dynamic and censored model. The results in this paper show that the Gibbs sampler with the data augmentation algorithm is a practical and useful method for dynamic Tobit models. We do not run into any computational problem even when using the ZT data. Therefore, contrary to their claim, we strongly recommend using the Gibbs sampler with data augmentation to estimate dynamic Tobit models.

The remainder of the paper is organized as follows. Section 2 introduces a dynamic Tobit model and points out its computational difficulty. Section 3 discusses the Bayesian estimation of the model and focuses on developing a sampling scheme of the latent data required by the data augmentation technique.
A Monte Carlo study is conducted in Section 4 to demonstrate the performances of our method in various cases. Section 5 applies the method to a regression model of Japanese car exports to the U.S. with a non-tariff trade barrier. Section 6 concludes the paper.
2 The Model
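Consider a dynamic Tobit model in which an observation $y_t$ is supposed to be generated by

$$y_t^* = \beta_1 x_{1t} + \cdots + \beta_{k-p} x_{k-p,t} + \lambda_1 y_{t-1}^* + \cdots + \lambda_p y_{t-p}^* + \epsilon_t,$$
$$y_t = \max\{y_t^*, 0\}, \qquad t = 1, 2, \ldots, T \qquad (1)$$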
where $y_t^*$ is a latent process, the error term $\epsilon_t$ follows i.i.d. $N(0, \sigma^2)$, $x_{it}$ ($1 \le i \le k-p$) is a covariate variable, and $\beta_i$ ($1 \le i \le k-p$), $\lambda_i$ ($1 \le i \le p$) and $\sigma^2$ are parameters.¹ For notational convenience, we write $\beta = (\beta_1, \beta_2, \ldots, \beta_{k-p}, \lambda_1, \ldots, \lambda_p)'$, $\theta = (\beta', \sigma^{-2})'$, $y_u = \{y_t \mid y_t^* > 0\}$, $y_c^* = \{y_t^* \mid y_t^* \le 0\}$, $Z = \{y_u, y_c^*\}$ and $Y = \{y_t \mid 1 \le t \le T\}$. In words, $\beta$ is the vector of mean parameters, $\theta$ the vector of total parameters, $y_u$ the set of observed $y_t^*$, $y_c^*$ the set of unobserved $y_t^*$, $Z$ the "augmented" data and $Y$ the observed data. Note that $p$ indicates the order of the AR process $y_t^*$ and $k$ the total number of independent variables (including the lagged latent dependent variables). In addition, we assume that the initial $p$ observations are uncensored.²

This formulation of a dynamic Tobit model is fairly general. It allows part of the covariates $x_i$'s to be the lagged terms of some other covariate variables. It is also not difficult to extend the threshold to, for example, a deterministic process $L_t$ or to a deterministic band process $[a_t, b_t]$. Note that using zero as the censoring threshold is simply a convenient normalization. Moreover, the distributional assumption of the error term $\epsilon_t$ can be easily relaxed in the direction of the Student-t family.³

¹ Other types of dynamic Tobit models exist. One of them, given in Maddala (1987), differs from our model (1) in that the dynamics are captured by lagged observed dependent variables rather than by lagged unobserved dependent variables. This model can be used when censoring imposes an impact on the latent process of $y_t^*$. The observability of the lagged dependent variables makes estimation and inference for this model similar to that of a static Tobit model. Consequently, conventional (classical and Bayesian) approaches are still applicable to this model.

First, note that the ZT model is a special case of our model (1). To see why, let us rewrite their model in a slightly different form (they used $L_t$ as the censoring limits)
where $u_t \sim$ i.i.d. $N(0, \tau^2)$. After a quasi-difference transformation, the latent process $y_t^*$ can be rewritten as $y_t^* = \rho y_{t-1}^* + x_t'\beta - \rho x_{t-1}'\beta + u_t$, which is similar in form to model (1).

In formulation (1), censoring is implicitly assumed to be driven by sampling and thus imposes no impact on the latent process of $y_t^*$ (it might be interesting to compare this model with the one in Footnote 1). The difficulty involved in estimation and inference for this model stems from the fact that the sampling distribution is analytically intractable. For example, let $f(y_u, y_c^* \mid \theta)$ denote the joint distribution of $y_u$ and $y_c^*$ and be indexed by the parameter vector $\theta$.⁴ The sampling distribution of this model is
In the likelihood function, the data $\{y_t \mid y_t = 0, 1 \le t \le T\}$ appear at the upper limits of the integrals. The total number of integrals in the expression equals the total number of censored observations in the sample. The exact form of this likelihood can be derived by following Zeger and Brookmeyer's (1986) derivation combined with the analytical results to be obtained in the next section. This likelihood function often involves high dimensional integrals due to the dependence of (latent) observations, the source of the analytical intractability. We explain this problem more carefully in the next section.
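To make the data generating mechanism of model (1) concrete, the following is a minimal simulation sketch in plain Python. The function name `simulate_dynamic_tobit`, its arguments, and the way pre-sample latent values are supplied through `y_init` are illustrative assumptions, not part of the paper.

```python
import random


def simulate_dynamic_tobit(beta, lam, sigma, x, y_init, seed=0):
    """Simulate model (1): y*_t = x_t'beta + sum_v lam_v y*_{t-v} + eps_t, y_t = max(y*_t, 0).

    beta   : coefficients on the covariates
    lam    : AR coefficients (lambda_1, ..., lambda_p)
    sigma  : standard deviation of the Gaussian error
    x      : list of covariate rows, one row per period
    y_init : p pre-sample latent values, oldest first (hypothetical start-up values)
    """
    rng = random.Random(seed)
    p = len(lam)
    history = list(y_init)  # latent values so far, oldest first
    y_star, y = [], []
    for x_t in x:
        mean = sum(b * v for b, v in zip(beta, x_t))
        # history[-1] is y*_{t-1}, history[-p] is y*_{t-p}
        mean += sum(lam[v] * history[-1 - v] for v in range(p))
        latent = mean + rng.gauss(0.0, sigma)
        history.append(latent)
        y_star.append(latent)
        y.append(max(latent, 0.0))  # censoring at zero
    censored = [v <= 0.0 for v in y_star]
    return y_star, y, censored
```

The share of `True` entries in `censored` gives the censoring ratio of one simulated sample. Only `y` would be available to the analyst; `y_star` is returned here purely for checking.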
² This assumption can be replaced by another one: there exists at least one set of $p$ consecutive uncensored observations (see footnote 6). If there does not exist any set of $p$ consecutive uncensored observations in model (1), we must be cautious in applying the algorithm developed in this paper. For a stationary process $y_t^*$, initializing the latent data $\{y_t^* \mid y_t^* \le 0, 1 \le t \le p\}$ in the support of their distribution would still make the algorithm valid. For a non-stationary process $y_t^*$, however, the algorithm is no longer reliable. The reason is that the effect of initialization on the subsequent values of $y_t^*$ would not die down with time. Consequently, the initialization can substantially impact the model estimation.
³ The ideas of this paper and Geweke (1993) can be easily combined to handle a Student-t version of the dynamic Tobit model.
⁴ For simplicity, the notation $f(y_u, y_c^*)$ is used a bit sloppily. Here we need to assume that $y_u$ and $y_c^*$ follow their original time series order.
3 The Estimation
This section studies the Bayesian estimation of model (1). We use the well-known Gibbs sampler with the data augmentation method (see, for example, Tanner and Wong (1987) and Gelfand and Smith (1990)) and focus on developing a sampling scheme of the conditional distribution $y_c^* \mid \{\theta, Y\}$. The Bayesian estimation involves manipulating density functions through Bayes' Theorem

$$p(\theta \mid Y) \propto \pi(\theta)\, \mathcal{L}(\theta; Y)$$

where $\pi(\theta)$ is the prior density of $\theta$, $\mathcal{L}(\theta; Y)$ the likelihood function (viewed as a conditional density of $Y$ given $\theta$) and $p(\theta \mid Y)$ the posterior density of $\theta$. Under a quadratic loss function, the Bayesian point estimate of any function $g(\cdot)$ of the parameter vector $\theta$ is
Due to the analytical intractability of the high dimensional integrals in the likelihood function, a direct evaluation of (4) is often difficult, if not impossible. We turn to the Gibbs sampler with the data augmentation approach to find the solution to it.

Prior specification

We consider both a normal-gamma prior and Jeffreys' prior (cf. Poirier (1995) and Zellner (1971)).⁵ A normal-gamma prior of the parameter vector $\theta$ is given by

where $\{\underline{\beta}, \underline{Q}, \underline{s}^{-2}, \underline{\nu}\}$ are hyperparameters, $\phi_k(\beta \mid \underline{\beta}, \sigma^2 \underline{Q})$ is a $k$-dimensional normal distribution with mean $\underline{\beta}$ and variance $\sigma^2 \underline{Q}$, and $\gamma(\sigma^{-2} \mid \underline{s}^{-2}, \underline{\nu})$ is a gamma distribution with mean $\underline{s}^{-2}$ and degrees of freedom $\underline{\nu}$. Jeffreys' prior is
Conditional distributions of parameters

Given the latent data $y_c^*$, model (1) collapses to a dynamic linear regression model. The conditional distribution $\theta \mid \{y_u, y_c^*\}$ is thus in the normal-gamma form under both priors (see, for example, Zellner (1971) and Poirier (1995)). For convenience, we summarize the main results of the conditional distribution $\theta \mid \{y_u, y_c^*\}$. Given the latent data, the collapsed model (1) can be neatly written in matrix form

$$Z = X\beta + U$$

where $Z$ is the vector of the augmented data $\{y_u, y_c^*\}$ in their original time series order, $X = (x_1, x_2, \ldots, x_{k-p}, y_{-1}^*, \ldots, y_{-p}^*)$ the $T \times k$ covariate matrix, $U = (\epsilon_1, \epsilon_2, \ldots, \epsilon_T)'$, and $y_{-i}^*$ the $i$th-period lagged latent dependent variable. Under prior (5), we have
where

and

$$\sigma^{-2} \mid \{\beta, Z\} \sim \gamma(\sigma^{-2} \mid s_*^{-2}, \nu_*) \qquad (8)$$

where

⁵ The random variables $\{\beta, \sigma^{-2}\}$ are said to follow a normal-gamma distribution if $\beta \mid \sigma^{-2}$ is a normal distribution and $\sigma^{-2}$ a gamma distribution (see, for example, Poirier (1995, p. 128)).
Similarly, under prior (6), we have
$$\beta \mid \{\sigma^{-2}, Z\} \sim N_k(\hat{\beta}, \sigma^2 (X'X)^{-1}) \qquad (9)$$

and

$$\sigma^{-2} \mid \{\beta, Z\} \sim \gamma(\sigma^{-2} \mid s_*^{-2}, \hat{\nu}) \qquad (10)$$
where the notations are the same as those in ( 7 ) and (8). The conditional distributions (7) - (10) are either normal or gamma and can be readily sampled.
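As an illustration of how such parameter draws might be coded once the latent data are filled in, the sketch below alternates a normal draw for $\beta$ and a gamma draw for $\sigma^{-2}$. It assumes the common noninformative prior $p(\beta, \sigma^2) \propto 1/\sigma^2$, under which $\beta \mid \{\sigma^{-2}, Z\} \sim N(\hat{\beta}, \sigma^2 (X'X)^{-1})$ and $\sigma^{-2} \mid \{\beta, Z\}$ is gamma with shape $T/2$ and rate $(Z - X\beta)'(Z - X\beta)/2$. The exact hyperparameters of (8) and (10) are not reproduced here, so this parametrization and the function names are assumptions.

```python
import numpy as np


def draw_beta(Z, X, sigma2, rng):
    """Draw beta from N(beta_hat, sigma2 * (X'X)^{-1}), with beta_hat the OLS estimate."""
    XtX = X.T @ X
    beta_hat = np.linalg.solve(XtX, X.T @ Z)
    chol = np.linalg.cholesky(np.linalg.inv(XtX))
    return beta_hat + np.sqrt(sigma2) * chol @ rng.standard_normal(X.shape[1])


def draw_precision(Z, X, beta, rng):
    """Draw sigma^{-2} from a gamma with shape T/2 and rate SSR/2 (assumed prior 1/sigma^2)."""
    resid = Z - X @ beta
    shape = len(Z) / 2.0
    rate = resid @ resid / 2.0
    return rng.gamma(shape, 1.0 / rate)  # NumPy parametrizes the gamma by scale = 1/rate
```

Inside a Gibbs sweep, `Z` and the lagged columns of `X` would be rebuilt from the current latent draws before calling these two functions.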
Conditional distributions of latent data

We first define a term to describe a decomposition of the latent data into certain subgroups. Given a sample of model (1), a latent string is a subset of consecutive observations that begins with a set of $p$ consecutive uncensored observations followed immediately by a censored observation and ends after the next set of $p$ consecutive uncensored observations. In fact, a latent string can be viewed as the minimum, complete and probabilistic information unit used to learn about the censored data within this latent string. The following two examples are designed to elaborate the concept.

Example 3.1 In model (1), we assume $p = 2$, and a sample of $y_t$ is the following

where $u_t$ denotes an uncensored observation and $c_t$ a censored one. There exist two latent strings in this sample: $\{u_2, u_3, c_4, c_5, u_6, c_7, c_8, u_9, u_{10}\}$ and $\{u_{10}, u_{11}, c_{12}, u_{13}, u_{14}\}$. In other words, a latent string consists of two sets of $p$ consecutive uncensored observations as its supporting ends. The number of consecutive uncensored observations between (not including) the two supporting ends must be less than $p$. It is noted that two latent strings might have a common or overlapped (uncensored) supporting end. However, the censored elements in two distinct latent strings can never intersect. This characteristic of latent strings is useful in deriving the conditional distributions of the latent data.
Example 3.2 A sample of $y_t$ from model (1) with $p = 1$ is assumed to be the following

where the notations are the same as those in Example 3.1. In this example, we have three latent strings $\{u_2, c_3, c_4, c_5, u_6\}$, $\{u_8, c_9, u_{10}\}$ and $\{u_{10}, c_{11}, u_{12}\}$. None of the uncensored
observations stands between any two censored observations in any latent string. This is not true in general for $p > 1$ and suggests that latent strings in a higher order ($p > 1$) dynamic Tobit model can be more complicated.

Without loss of generality, we assume that the first and last sets of $p$ consecutive observations in the sample of model (1) are uncensored.⁶ Unlike in a static Tobit model, the conditional distribution $y_c^* \mid \{\theta, Y\}$ may be inherently high dimensional in the dynamic and censored setting (1). If a large, or even modest, proportion of observations are censored, it might be reasonable to expect that some latent strings are high dimensional. Following Zeger and Brookmeyer (1986), we can show that an information-preserving dimension reduction of the distribution $y_c^* \mid \{\theta, Y\}$ holds.

Proposition 3.1 The density function of $y_c^* \mid \{\theta, Y\}$ can be decomposed into the product of the joint density of the latent data in each latent string over all such strings.
Proof See the Appendix.
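The decomposition into latent strings can be sketched in a few lines of Python. The function below scans a vector of censoring indicators and returns the first and last index of each string. The function name and the 0-based index convention are illustrative choices, and the two censoring patterns in the usage lines are assumed patterns chosen to be consistent with the strings listed in Examples 3.1 and 3.2, since the original sample displays are not reproduced here.

```python
def latent_strings(censored, p):
    """Return (start, end) pairs, 0-based and inclusive, one per latent string.

    censored : sequence of booleans, True where the observation is censored
    p        : order of the latent AR process
    """
    T = len(censored)
    strings = []
    t = 0
    while t < T:
        if not censored[t]:
            t += 1
            continue
        start = t - p  # left supporting end: the p observations before the first censored one
        if start < 0 or any(censored[start:t]):
            raise ValueError("a latent string needs p uncensored observations before it")
        run, end = 0, t
        while run < p:  # move right until p consecutive uncensored observations are seen
            end += 1
            if end >= T:
                raise ValueError("the sample must end with p uncensored observations")
            run = 0 if censored[end] else run + 1
        strings.append((start, end))
        t = end + 1
    return strings


# Assumed pattern for Example 3.1 (p = 2): u1 u2 u3 c4 c5 u6 c7 c8 u9 u10 u11 c12 u13 u14
example_31 = [False, False, False, True, True, False, True, True,
              False, False, False, True, False, False]
# Assumed pattern for Example 3.2 (p = 1): u1 u2 c3 c4 c5 u6 u7 u8 c9 u10 c11 u12
example_32 = [False, False, True, True, True, False, False, False,
              True, False, True, False]
```

With these patterns, `latent_strings(example_31, 2)` gives the index pairs `(1, 9)` and `(9, 13)`, that is $u_2$ to $u_{10}$ and $u_{10}$ to $u_{14}$ in 1-based notation, with the overlapping supporting end $u_{10}$.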
This proposition guarantees that sampling the latent data from the multivariate distribution $y_c^* \mid \{\theta, Y\}$ is equivalent to sampling the latent data separately from each latent string. Now consider a typical latent string $\{y_{t-p+1}, \ldots, y_t, y_{t+1}^*, \ldots, y_{t+n_t}^*, y_{t+n_t+1}, \ldots, y_{t+n_t+p}\}$ where $n_t$ is the number indexing the length of the string. From the specification of model (1), the conditional distribution $\{y_{t+1}^*, \ldots, y_{t+n_t}^*, y_{t+n_t+1}, \ldots, y_{t+n_t+p}\} \mid \{\theta, y_{t-p+1}, \ldots, y_t\}$ is known to be in the multivariate normal form. Denote the mean and variance of the conditional distribution as $A_t$ and $\Omega_t$, respectively. Then they are given by

$$A_t = (\alpha_{t+1}, \ldots, \alpha_{t+n_t}, \ldots, \alpha_{t+n_t+p})' \qquad (11)$$

where

$$\alpha_{t+\tau} = \sum_{m=1}^{k-p} \beta_m x_{m,t+\tau} + \sum_{v=1}^{p} \lambda_v \alpha_{t+\tau-v}, \qquad \tau = 1, 2, \ldots, n_t + p \qquad (12)$$

and
For simplicity, we implicitly assume in (12) and (13) that if a subscript is less than 1, its associated term is zero. Also note that the terms associated with $\tau - v < 1$ imply $\alpha_{t+\tau-v} = y_{t+\tau-v}$, which are used to initialize the recursive computation of (12). These results can be readily confirmed by iterating the latent process $y_t^*$ in (1) and using some simple algebra. Computing $\omega_{ij}$ requires the following computational order since the former elements also serve as part of the inputs in calculating the latter ones

⁶ This assumption is not necessary and is used here to simplify the presentation. In fact, if the first $p$ observations are not uncensored, it is supposed that one can find in the sample first $p$ uncensored observations somewhere: $\{y_{t_1}, \ldots, y_{t_p}\}$. The procedure discussed in this section requires a minor change: for $t < t_1$, equation (12) needs to be changed into a forward-looking expression and equations (13)-(15) should be changed accordingly.

Figure 1. Switching Mapping
To use the data augmentation technique, we need to work out the distribution of the latent data in the latent string conditional on the parameter vector $\theta$ and all the observables in the string. It should be noted that a latent string may contain other uncensored observations between its supporting ends when $p > 1$ (see Example 3.1). From an efficiency point of view, the uncensored data in the latent string offer, through the dependence structure (13), additional information about the latent data in the string. We consider rearranging the string $\{y_{t+1}^*, \ldots, y_{t+n_t}^*, y_{t+n_t+1}, \ldots, y_{t+n_t+p}\}$ into a new one $\{\tilde{y}_{t+1}^*, \ldots, \tilde{y}_{t+l}^*, \tilde{y}_{t+l+1}, \ldots, \tilde{y}_{t+n_t+p}\}$ where the first $l$ elements $\{\tilde{y}_{t+1}^*, \ldots, \tilde{y}_{t+l}^*\}$ are censored and the rest $\{\tilde{y}_{t+l+1}, \ldots, \tilde{y}_{t+n_t+p}\}$ are uncensored. Formally, this can be done by defining a one-to-one switching mapping $S: \{1, 2, \ldots, n_t + p\} \to \{1, 2, \ldots, n_t + p\}$, which shifts all censored observations in the string into the first $l$ positions and uncensored ones into the next $n_t + p - l$ positions. Figure 1 shows such a mapping in a hypothetical case.
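The mean recursion (12) and the switching mapping lend themselves to a short numerical sketch. In the code below the mean is obtained by running the recursion forward from the left supporting end. The covariance is built from the moving-average weights of the latent AR process, which is a standard derivation and is offered as an assumption about what the variance expression amounts to, not as a transcription of (13). Function and argument names are illustrative.

```python
import numpy as np


def string_moments(xb, lam, y_left, sigma2):
    """Mean and covariance of the string elements after the left supporting end.

    xb     : length-n array, xb[r] = sum_m beta_m x_{m,t+r+1} (regression part per period)
    lam    : AR coefficients (lambda_1, ..., lambda_p)
    y_left : left supporting end (y_{t-p+1}, ..., y_t), oldest first
    sigma2 : error variance
    """
    lam = np.asarray(lam, dtype=float)
    p, n = len(lam), len(xb)
    # Mean: recursion (12), seeded with the observed left supporting end.
    a = np.concatenate([np.asarray(y_left, dtype=float), np.zeros(n)])
    for r in range(n):
        a[p + r] = xb[r] + lam @ a[r:r + p][::-1]
    # psi[j]: weight of the shock dated j periods earlier in the current latent value.
    psi = np.zeros(n)
    psi[0] = 1.0
    for j in range(1, n):
        psi[j] = sum(lam[v] * psi[j - 1 - v] for v in range(min(p, j)))
    # Deviations from the mean are a lower-triangular transform of the shocks.
    weights = np.zeros((n, n))
    for r in range(n):
        weights[r, :r + 1] = psi[:r + 1][::-1]
    return a[p:], sigma2 * weights @ weights.T


def switching_order(censored):
    """Index order placing censored entries first and uncensored ones after, order kept."""
    flags = np.asarray(censored, dtype=bool)
    idx = np.arange(len(flags))
    return np.concatenate([idx[flags], idx[~flags]])
```

Given `order = switching_order(flags)`, the reorganized moments are `mean[order]` and `cov[np.ix_(order, order)]`, so position `i` of the new string holds the element that sat at `order[i]` in the original one.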
Again, $c$ denotes a censored observation and $u$ an uncensored one. The switching mapping $S$ is a tool to reorganize a string containing latent data so that the conventional theory for the conditional multivariate normal distribution can be used. The mean and variance of the reorganized string $\{\tilde{y}_{t+1}^*, \ldots, \tilde{y}_{t+l}^*, \tilde{y}_{t+l+1}, \ldots, \tilde{y}_{t+n_t+p}\}$ are
where $\tilde{\alpha}_{t+i} = \alpha_{t+S^{-1}(i)}$ and $\tilde{\omega}_{ij} = \omega_{S^{-1}(i)\,S^{-1}(j)}$ and $S^{-1}$ is the inverse of $S$. The latent data $\{\tilde{y}_{t+1}^*, \tilde{y}_{t+2}^*, \ldots, \tilde{y}_{t+l}^*\} \mid \{\theta, y_{t-p+1}, \ldots, y_t, \tilde{y}_{t+l+1}, \tilde{y}_{t+l+2}, \ldots, \tilde{y}_{t+n_t+p}\}$ still follow a normal distribution. Suppose that the mean and variance of the conditional distribution $\{\tilde{y}_{t+1}^*, \tilde{y}_{t+2}^*, \ldots, \tilde{y}_{t+l}^*\} \mid \{\theta, y_{t-p+1}, \ldots, y_t, \tilde{y}_{t+l+1}, \tilde{y}_{t+l+2}, \ldots, \tilde{y}_{t+n_t+p}\}$ are $\bar{A}_t$ and $\bar{\Omega}_t$, which are derived from the conventional theory for the conditional multivariate normal distribution. Then we can summarize and extend the above discussion as follows.
Proposition 3.2 For a latent string $\{y_{t-p+1}, \ldots, y_t, y_{t+1}^*, \ldots, y_{t+n_t}^*, y_{t+n_t+1}, \ldots, y_{t+n_t+p}\}$ in a sample of model (1), the latent data in the string, reorganized as $\{\tilde{y}_{t+1}^*, \tilde{y}_{t+2}^*, \ldots, \tilde{y}_{t+l}^*\}$ through the switching mapping $S$, follow a truncated normal distribution
with the support $(-\infty, 0]^l$.

So far, we have obtained the complete conditional distributions of the latent variables $y_c^*$ and parameter vector $\theta$. The Gibbs sampler with the data augmentation algorithm can be run by iterating (7), (8) and (15) under prior (5), or (9), (10) and (15) under prior (6). Truncated normal distribution (15) can be sampled according to Geweke's (1991) method. The algorithm can be initialized at any point of the parameter space. By use of the Gibbs sampler with the data augmentation algorithm, the draws $\{\theta^{(i)}, y_c^{*(i)}\}_{i=m+1}^{m+M}$ with some suitable values of $m$ and $M$ form a posterior sample of $\theta$ and $y_c^*$ and are used to compute posterior means and standard deviations, or other posterior moments, of $\theta$ and $y_c^*$.

Note on comparison with the ZT approaches

As mentioned before, the ZT model is a special case of our model. One difference between our approach and theirs is that we do not require the stationarity of the latent process $y_t^*$ while they do. This difference is reflected in the process of deriving the conditional distribution of the latent data in a latent string. To do so, they calculate the unconditional means and variances of the uncensored supporting ends, before applying the conventional theory for the conditional multivariate normal distribution (see pp. 116-117 in ZT). Obviously, these unconditional means and variances exist only if the latent process $y_t^*$ is stationary. All of their approaches begin with the likelihood function, so the stationarity condition for the latent process $y_t^*$ is required in all of them. However, this stationarity condition is not necessary. Given a latent string, we first derive the distribution of its variables excluding and conditioning on the left supporting end. This strategy is simple because the only trick is to iterate the latent process $y_t^*$ in model (1) (see (11)-(13)). We then derive the conditional distribution of the latent data given all uncensored data in the string. In the whole process, we do not need the stationarity of the latent process $y_t^*$. After the latent data are generated, we face a complete dynamic linear model, which is solved without the stationarity requirement in the Bayesian framework (see Zellner (1971)).
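The step of drawing the censored block of a string can be illustrated as follows. The first function applies the textbook conditional multivariate normal formulas to a reorganized string whose first `l` coordinates are censored. The sampler then draws from the resulting normal restricted to values at or below the censoring point, using coordinate-wise Gibbs updates with an inverse-CDF draw for each univariate truncated normal. This is a simplified stand-in chosen for brevity and is not Geweke's (1991) algorithm; all names and the clamping constants are assumptions.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()


def conditional_normal(mean, cov, l, y_obs):
    """Moments of the first l coordinates given that the remaining ones equal y_obs."""
    m1, m2 = mean[:l], mean[l:]
    s11, s12, s22 = cov[:l, :l], cov[:l, l:], cov[l:, l:]
    gain = s12 @ np.linalg.inv(s22)
    return m1 + gain @ (y_obs - m2), s11 - gain @ s12.T


def draw_upper_truncated(mu, sd, upper, rng):
    """Inverse-CDF draw from N(mu, sd^2) restricted to values <= upper."""
    p_max = _STD.cdf((upper - mu) / sd)
    u = rng.uniform(0.0, p_max)
    u = min(max(u, 1e-300), 1.0 - 1e-16)  # keep u strictly inside (0, 1)
    return min(mu + sd * _STD.inv_cdf(u), upper)


def gibbs_truncated_mvn(mean, cov, upper, n_sweeps, rng):
    """Coordinate-wise Gibbs sampler for N(mean, cov) with every coordinate <= upper."""
    prec = np.linalg.inv(cov)
    z = np.minimum(mean, upper).astype(float)  # feasible starting point
    for _ in range(n_sweeps):
        for i in range(len(z)):
            sd = 1.0 / np.sqrt(prec[i, i])
            # contribution of the other coordinates to the conditional mean
            others = prec[i] @ (z - mean) - prec[i, i] * (z[i] - mean[i])
            mu = mean[i] - others / prec[i, i]
            z[i] = draw_upper_truncated(mu, sd, upper, rng)
    return z
```

For a latent string, `mean` and `cov` passed to `gibbs_truncated_mvn` would be the output of `conditional_normal`, and `upper` would be the censoring threshold of model (1), namely zero.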
4 A Monte Carlo Study
A Monte Carlo study is conducted in this section to evaluate the practical performance of the proposed method. With the true data generating process (DGP) known, it is thus
possible to observe the adequacy of the method in alternative circumstances. The analysis is conducted with Jeffreys' prior (6). The existence of the posteriors can be confirmed by noting that the posterior of a complete dynamic linear model with Jeffreys' prior exists and at least $p$ initial observations are uncensored. The programs are written in Fortran-77 and run on a Silicon Graphics 4D-320 UNIX machine with intensive use of the IMSL Math/Library and Stat/Library. The regression DGP is designed through

where $\beta_1 = -1.0$, $\beta_2 = 1.0$ and the sample size is $T$. The covariate $x_2$ is generated by an AR(1) process

$$x_{2t} = c_1 + .5\, x_{2,t-1} + c_2 v_t, \qquad v_t \sim \text{i.i.d. } N(0, 1)$$
where $c_1$ and $c_2$ are adjustable to achieve various censoring levels. The experiments consist of two examples with sample sizes $T = 50$, 100, and 200.

Example 4.1 Consider model (16) with $p = 1$. In this case, $\lambda_1$ is designed to take values .1, .5, .9, 1.0 and 1.2. With this design, we intend to investigate how the proposed method responds to changes of $\lambda_1$. Note that $\lambda_1 = 1.0$ is associated with a non-stationary (unit root) latent process $y_t^*$ and $\lambda_1 = 1.2$ corresponds to an explosive latent process. Technically, we set $c_2 = 1$ and adjust $c_1$ to achieve different censoring levels.

Example 4.2 Set $p = 2$ in model (16). Design $(\lambda_1, \lambda_2) = (.2, .1)$, $(.5, .1)$, $(.7, .3)$ and $(.7, .4)$. Because latent strings in this example are more complicated than those in Example 4.1, we attempt in this design to show the implementation and performance of the proposed method in a more complicated framework. Note that $(\lambda_1, \lambda_2) = (.7, .3)$ is associated with a unit root latent process $y_t^*$ and $(\lambda_1, \lambda_2) = (.7, .4)$ corresponds to an explosive one. Technically, we set $c_2 = 10$ and adjust $c_1$ to achieve different censoring levels.
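The classification of the designs in Examples 4.1 and 4.2 as stationary, unit root or explosive can be checked numerically from the eigenvalues of the companion matrix of the latent AR process. The sketch below is one way to do so; the function name `ar_regime` and the tolerance `tol` are illustrative assumptions.

```python
import numpy as np


def ar_regime(lam, tol=1e-8):
    """Classify an AR(p) process with coefficients lam by its largest companion eigenvalue."""
    lam = np.asarray(lam, dtype=float)
    p = len(lam)
    companion = np.zeros((p, p))
    companion[0] = lam                    # first row holds lambda_1, ..., lambda_p
    companion[1:, :-1] = np.eye(p - 1)    # shifted identity below the first row
    radius = np.abs(np.linalg.eigvals(companion)).max()
    if radius < 1.0 - tol:
        return "stationary"
    if radius > 1.0 + tol:
        return "explosive"
    return "unit root"


designs = [(0.1,), (0.5,), (0.9,), (1.0,), (1.2,),
           (0.2, 0.1), (0.5, 0.1), (0.7, 0.3), (0.7, 0.4)]
regimes = {d: ar_regime(d) for d in designs}
```

For $p = 2$ the largest eigenvalue solves $\mu^2 - \lambda_1 \mu - \lambda_2 = 0$, which gives exactly 1 for $(.7, .3)$ and a value above 1 for $(.7, .4)$, in line with the description of the two designs.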
For each designed model (i.e., fixing the parameter vector $\theta$, sample size $T$, and the values of covariate $x_2$), its "expected" censoring level is taken as a measure of its unobservability.⁷ This expected value, however, is not easy to evaluate analytically. Instead of a direct approach, we compute the estimated expected censoring level (EECL) of each model by averaging the censored ratios of the model over various seeds in coding our program. The EECLs are reported in Tables 1-4. In the implementation of our method, the first 2,000 draws are discarded and the algorithm is run for another 5,000 draws. The convergence of each chain is detected by visual inspection of the CUMSUM statistic.⁸ The convergence statistics are

⁷ Theoretically, the expected censoring level of model (16) equals $\sum_{t=1}^{T} t \times \text{Prob}(\text{the number of censored observations} = t \mid \text{Model})$ for a given sample size $T$.

⁸ The CUMSUM statistic (see Yu and Mykland (1994)) of a Monte Carlo Markov chain (MCMC) $\theta^{(i)}$ for $N$ draws is given by

$$CS_t = \left( \frac{1}{t} \sum_{i=1}^{t} \theta^{(i)} - \bar{\theta} \right) \Big/ \hat{\sigma} \qquad \text{for } t = 50, 100, 150, \ldots, N$$

where $\bar{\theta}$ and $\hat{\sigma}$ are the empirical mean and standard deviation of the $N$ draws. If the MCMC converges, the plot of $CS_t$ against $t$ should converge smoothly to zero. On the contrary, long and regular excursions away from zero indicate an absence of convergence of the MCMC.
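The CUMSUM path described in the footnote can be computed with a short routine such as the one below. It is a sketch: the function name, the default step of 50 draws and the use of the population standard deviation for the "empirical" standard deviation are assumptions.

```python
from statistics import fmean, pstdev


def cumsum_path(draws, step=50):
    """Return (t, CS_t) pairs for t = step, 2*step, ... up to the number of draws."""
    mu = fmean(draws)
    sd = pstdev(draws)  # empirical standard deviation of all N draws
    path, running = [], 0.0
    for t, value in enumerate(draws, start=1):
        running += value
        if t % step == 0:
            # standardized gap between the running mean and the full-sample mean
            path.append((t, (running / t - mu) / sd))
    return path
```

Plotting the second element of each pair against the first gives the diagnostic curve. By construction the last point is zero whenever the number of draws is a multiple of `step`, so the informative feature is how the path behaves before it gets there.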
not reported since the quality of our parameter estimates can be easily seen given that their designed values are known. The numerical standard errors are reported in square brackets according to Ripley (1987).⁹ The batch size $b$ is selected such that the first-order correlation between the batch means is at most .02. The one-period lagged correlation of the Gibbs run is in curly brackets. The time spent on the estimation of each model varies and depends largely upon the model's degree of observability (i.e., its censoring level) and its sample size. For example, when EECL = 20% and $T = 50$, 27 seconds are spent on the estimation for Example 4.1, and about 1 minute on that for Example 4.2. If EECL increases to 80%, the computing time rises to 20 and 25 minutes, respectively.

The estimation results of the two examples are reported in Tables 1-4. The posterior means and standard deviations are computed by use of the Rao-Blackwellization of the posterior draws (cf. Gelfand and Smith (1990)). For comparison purposes, three types of estimates are presented in each of the tables. The numbers just below the parameters are their designed values. Line a stands for OLS estimates with the data $\{y_t^*\}$. This corresponds to the collapsed model (1), i.e., a fully "observed" latent structure (this can be realized in a Monte Carlo study because the data are artificially generated and censored). Line b reports OLS estimates with the data $\{y_t\}$ when censoring is ignored. Clearly this method is not right, but here we can see how much the parameter estimates are distorted by this naive OLS procedure. Line c represents the parameter estimates with the censored data $\{y_t\}$ when the method developed in this paper is employed. The standard deviation of each posterior mean is in brackets and the one-period lagged correlation of the Gibbs run is in curly brackets. From these tables, we can ascertain that our estimates are very close to their designed values.
Comparing them with the richer information-based estimates in line a, we observe that the results in the two lines are close and thus conclude that our method performs very well. The batch standard errors are small, which indicates that our estimates are reliable. On the other hand, the OLS estimates with the censored data $\{y_t\}$ perform poorly and worsen as the censoring level increases (see line b in each table). Part I of Tables 1 and 3 suggests that the OLS estimates in line b are heavily distorted. Furthermore, the estimated $\sigma^2$ in line b is far from its designed value, and in Table 3 it is much larger. Intuitively, this means that a linear (dynamic) model that ignores the censoring cannot fit (dynamic) censored data in general. We thus conclude that the OLS method is indeed inappropriate for dynamic Tobit models. Because the OLS estimates are easy to compute, they are used here to initialize the Gibbs sampler with the data augmentation algorithm, although this is not necessary. In addition, Tables 1 and 3 show that the lower a model's censoring level is, the better its estimates are. This "censoring level" effect is intuitive because an increase in the "expected" censoring level results in a greater loss of information in the data. Next, the estimated results regarding changes of $\lambda_1$ and $\lambda_2$ are reported in Part II of Tables 1 and 3. With EECL = 50%, the Bayesian method works very well, as a comparison of the estimates in line c with those in line a shows. Part I of Tables 2 and 4 suggests that our Bayesian estimates are consistent. When $\lambda_1$, $\lambda_2$ and the censoring level
Footnote 9. The batch mean method (cf. Ripley (1987)) is implemented as follows. Divide the posterior chain into b batches of length g. Denote the mean of each batch by $m_i$ and the average of the batch means by $\bar{m}$. Then the standard error of the estimate is given by $\{b(b-1)\}^{-1}\sum_{i=1}^{b}(m_i-\bar{m})^2$.
Table 1. Simulation Results (cf. Example 4.1)

I. Changes of EECL (Sample Size = 50)

EECL   Line   $\beta_1$      $\beta_2$     $\lambda_1$   $\sigma^2$
              -1.0           1.0           0.5           1.0
20%    a      -1.57 (.30)    1.15 (.13)    .49 (.06)     .97 (.20)
       b      -.76 (.25)     .85 (.12)     .56 (.06)     .77 (.16)
       c      -1.43 (.33)    1.05 (.14)    .53 (.06)     .87 (.21)
              [.004]         [.0014]       [.0006]       [.002]
              {?}            {-.18}        {-.09}        {-.12}
50%    a      -1.18 (.19)    1.03 (.11)    .52 (.06)     1.01 (.21)
       b      .07 (.14)      .50 (.08)     .56 (.07)     .65 (.14)
       c      -1.29 (.32)    1.09 (.15)    .53 (.06)     .92 (.27)
              [.009]         [.003]        [.001]        [.004]
              {.20}          {-.06}        {?}           {-.12}
80%    a      -1.09 (.20)    1.00 (.09)    .52 (.06)     1.03 (.21)
       b      .25 (.09)      .23 (.05)     .49 (.10)     .32 (.06)
       c      -.87 (.31)     .90 (.20)     .51 (.09)     .97 (.35)
              [.012]         [.009]        [.002]        [.007]
              {.10}          {.05}         {-.01}        {-.17}

II. Changes of $\lambda_1$ (Sample Size = 50; EECL = 50%)
Note: EECL denotes Estimated Expected Censoring Level. The numbers just below the parameters are their designed values. Line a refers to the OLS estimates with the data $\{y_t^*\}$ (i.e., the latent structure is "fully observed"); line b gives the OLS estimates with the designed censored data $\{y_t\}$; and line c gives the estimates from the censored data $\{y_t\}$ using the proposed method. Standard deviations are in parentheses; the numerical standard error of each posterior mean is in square brackets (see footnote 9); the one-period lagged correlation of the Gibbs run is in curly brackets. All of the estimates here are based on m = 2000 and M = 5000.
Table 2. Simulation Results (cf. Example 4.1)

I. Changes of Sample Size (EECL = 50%)

Sample Size: 100; 200

II. Large $\lambda_1$ and EECL (Sample Size = 100)

EECL   Line   $\beta_1$      $\beta_2$     $\lambda_1$   $\sigma^2$
              -1.00          1.00          .90           1.00
80%    a      -1.08 (.15)    .96 (.04)     .90 (.01)     .85 (.12)
       b      .17 (.07)      .23 (.02)     .82 (.04)     .37 (.05)
       c      -1.35 (.28)    1.27 (.22)    .91 (?)       .87 (.28)
              [.023]         [.020]        [.001]        [.007]
              {.16}          {?}           {.19}         {-.01}
Note: See the note to Table 1 for notations.
are all high, our method still performs fairly well (see Part II of Tables 2 and 4), but at the cost of a large increase in computing time. Table 1 (II) and Table 3 (II) show that our method encounters no problem at all when the process $y_t^*$ is non-stationary.
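The contrast between lines a and b of the tables can be reproduced in a few lines. The sketch below simulates a first-order dynamic Tobit process with the designed values of Example 4.1 ($\beta_1 = -1$, $\beta_2 = 1$, $\lambda_1 = 0.5$, $\sigma^2 = 1$) and runs OLS once on the latent series and once on the censored series. The regressor design (independent N(1, 1) draws), the start-up value, censoring from below at zero, and the function names are assumptions made for illustration; they are not taken from the paper's experimental design.

```python
import numpy as np

def simulate_dynamic_tobit(T, beta1, beta2, lam1, sigma, rng):
    """Simulate y*_t = beta1 + beta2*x_t + lam1*y*_{t-1} + e_t and y_t = max(y*_t, 0)."""
    x = rng.normal(1.0, 1.0, size=T)      # assumed regressor design
    y_star = np.empty(T)
    prev = 0.0                            # assumed start-up value
    for t in range(T):
        y_star[t] = beta1 + beta2 * x[t] + lam1 * prev + sigma * rng.normal()
        prev = y_star[t]
    y = np.maximum(y_star, 0.0)           # censoring from below at zero
    return x, y_star, y

def ols_ar1(dep, x):
    """OLS of dep_t on (1, x_t, dep_{t-1}); returns coefficients and residual variance."""
    X = np.column_stack([np.ones(dep.size - 1), x[1:], dep[:-1]])
    coef, *_ = np.linalg.lstsq(X, dep[1:], rcond=None)
    resid = dep[1:] - X @ coef
    s2 = float(resid @ resid / (X.shape[0] - X.shape[1]))
    return coef, s2

rng = np.random.default_rng(1)
x, y_star, y = simulate_dynamic_tobit(2000, -1.0, 1.0, 0.5, 1.0, rng)

coef_a, s2_a = ols_ar1(y_star, x)   # analogue of line a: latent data "fully observed"
coef_b, s2_b = ols_ar1(y, x)        # analogue of line b: censoring ignored
censoring_level = float(np.mean(y == 0.0))

print("censoring level:", censoring_level)
print("OLS on latent data:  ", coef_a, s2_a)
print("OLS on censored data:", coef_b, s2_b)
```

A long series (T = 2000) is used so that the distortion in the second regression is visible above sampling noise; the tables in the text use much smaller samples.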
5 An Application
One real-world example is given here to illustrate the kind of economic application our method can handle. The data used in this example were also studied by ZT, who examine a demand model of Japanese exports of passenger cars to the U.S., using annual and quarterly data from 1974 to 1992. U.S. automakers suffered huge losses at the end of the 1970s because of the gasoline crisis (see Tsurumi and Tsurumi (1983)). In the meantime, Japanese passenger car exports to the U.S. increased dramatically. As a compromise, the United States negotiated Voluntary Export Restraints (VERs) with Japan to curb Japanese car imports. The VERs took effect in 1981 at 1.68 million units per year. The limit was raised to 1.85 million in April 1984 and to 2.3 million in April 1985, and then lowered to 1.65 million in April 1992. The data indicate that the VERs were binding from 1981 to 1986. The time series plot of the data is shown in Figure 2. The triangles in the plot mark the periods in which the VERs were binding. As emphasized by ZT, although the VER was set annually, Japanese car dealers were subject to monthly allocation quotas during the time in which the VERs were effective. Therefore, we assume that the VER quotas were binding for each quarter during these five years.
Table 3. Simulation Results (cf. Example 4.2)

I. Changes of EECL (Sample Size = 50)

EECL   Line   $\beta_1$   $\beta_2$   $\lambda_1$   $\lambda_2$   $\sigma^2$

II. Changes of ($\lambda_1$, $\lambda_2$) (Sample Size = 50; EECL = 50%)

($\lambda_1$, $\lambda_2$)   Line   $\beta_1$      $\beta_2$     $\lambda_1$   $\lambda_2$   $\sigma^2$
                                    -1.0           1.0                                       1.0
(?, ?)      a      -1.13 (.15)    1.01 (.02)    .27 (.02)     .12 (.02)     1.01 (.21)
            b      2.32 (.76)     .48 (.06)     .76 (.12)     -.13 (.10)    19.41 (4.05)
            c      -1.30 (.26)    1.03 (.02)    .26 (.02)     .12 (.02)     .83 (.28)
                   [.008]         [.0007]       [.0005]       [.0003]       [.006]
                   {.17}          {-.08}        {?}           {.14}         {.31}
(.7, .3)    a      -1.11 (.16)    1.01 (.02)    .68 (.02)     .32 (.02)     1.02 (.21)
            b      .31 (.07)      1.28 (.14)    -.23 (.15)    .43 (.89)     30.81 (6.42)
            c      -1.12 (.17)    1.03 (.02)    .64 (.03)     .36 (.03)     .94 (.30)
                   [.003]         [.0004]       [.0006]       [.0006]       [.006]
                   {-.24}         {?}           {.06}         {-.27}        {.06}
(.7, .4)    a      -1.09 (.18)    1.01 (.02)    .68 (.02)     .42 (.02)     1.02 (.21)
            b      .23 (.07)      1.45 (.14)    -.37 (.16)    .15 (.93)     28.45 (5.93)
            c      -1.29 (.16)    1.02 (.02)    .69 (.03)     .41 (.03)     .69 (.21)
                   [.003]         [.0004]       [.0005]       [.0006]       [.004]
                   {-.05}         {?}           {?}           {-.30}        {-.03}
Note: See the note to Table 1 for notations
The question addressed here is how to model the demand in the U.S. for Japanese cars when the observations are constrained by the quotas. First of all, it is necessary to take this VER (censoring) effect into account, since otherwise the estimates would be distorted, as discussed before. Next, the evidence from the data shows that the demand is autocorrelated. Therefore, the data possess both censored and dynamic features, which makes the dynamic Tobit model an appropriate framework. To simplify the analysis, we assume that the VERs have no effect on an individual's demand decision making. This implies that VERs
Table 4. Simulation Results (cf. Example 4.2)

I. Changes of Sample Size (EECL = 50%; ($\lambda_1$, $\lambda_2$) = (.5, .2))

Sample Size   Line   $\beta_1$      $\beta_2$     $\lambda_1$   $\lambda_2$   $\sigma^2$
                     -1.0           1.0           0.5           0.2           1.0
?             a      -1.05 (.09)    1.00 (.01)    .48 (.01)     .22 (.01)     ? (.12)

II. Large ($\lambda_1$, $\lambda_2$) and Large EECL (EECL = 80%; Sample Size = 100)

($\lambda_1$, $\lambda_2$)   Line   $\beta_1$      $\beta_2$     $\lambda_1$   $\lambda_2$   $\sigma^2$
                                    -1.0           1.0                                       1.0
(.7, ?)     a      -1.22 (.14)    1.01 (.01)    .68 (.01)     .21 (.01)     .82 (.12)
            b      .22 (.58)      .58 (.06)     .61 (.06)     .17 (.06)     12.74 (1.84)
            c      -1.11 (.14)    1.00 (.02)    ?             .23 (.01)     ? (.12)
                   [.0004] [.0002] [.0002] [.002] [.002]
                   {-.03} {?} {-.32} {-.27} {.15}

Note: See the note to Table 1 for notations.
Figure 2. Japanese Car Exports and Transplant Production (series plotted: exports and transplant production; triangles mark periods in which the VER quota was binding)
do not change the desired demand process to be analyzed. To capture the above features of the demand function, the following model specification is introduced:
where $y_t^*$ is the logarithm of desired per capita demand in the U.S. for Japanese cars, $y_t$ the logarithm of observed per capita demand in the U.S. for Japanese cars, $x_{2t}$ the logarithm of per capita real disposable income, $x_{3t}$ the logarithm of the price ratio of Japanese cars to domestic cars, and $L_t$ the logarithm of the VER at time t. This model is a modified version of the demand regression function of Tsurumi and Tsurumi (1983) and ZT. Briefly, the economic justification of the independent variables is the following. Real rather than nominal per capita disposable income is chosen because purchases of cars are treated as investments in economics. A positive coefficient for this variable is thus expected. Following ZT, the relative price, i.e., the price ratio of Japanese cars to domestic cars, is chosen to capture price and cross-price effects. A negative coefficient for this variable is anticipated. To model the dynamic effect, it is important to ask whether $y_{t-1}^*$ or $y_{t-1}$ dictates the latent process $y_t^*$. From consumer theory, the demand function is derived from utility maximization subject to budget constraints. Because it is reasonable to assume that VERs affect neither an individual's budget constraint nor an individual's preferences, it is the lagged values of the latent demand $y_t^*$ that explain the desired demand $y_t^*$. From an economic point of view, the lagged $y_t^*$'s reflect the "inertial" effect of an individual's desired consumption. Thus, positive coefficients for the lagged desired variables are expected. Based on quarterly data from 1974 to 1992, our estimated results are reported in Table 5. Convergence of the MCMC chains is checked with plots of the CUMSUM statistics against t (see Figure 3). The basic message from the plots is that all of the chains converge after 2000 initial draws (i.e., m = 2000).
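For reference, a specification consistent with the variable definitions above and with the parameters reported in Table 5 can be sketched as follows. This is a reconstruction under stated assumptions rather than a quotation of model (17): normally distributed errors and the upper-censoring rule (observed demand capped by the quota, with $L_t$ expressed on the same per capita logarithmic scale as $y_t$) are assumed here.

```latex
\begin{aligned}
y_t^* &= \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t}
         + \sum_{j=1}^{p} \lambda_j \, y_{t-j}^* + \varepsilon_t,
         \qquad \varepsilon_t \sim N(0, \sigma^2), \\
y_t   &= \min\{\, y_t^*, \; L_t \,\}.
\end{aligned}
```

Under this reading, p = 1 involves the parameters $(\beta_1, \beta_2, \beta_3, \lambda_1, \sigma^2)$ and p = 2 adds $\lambda_2$.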
With a 27% censoring level, the computation of the posterior means and standard deviations of the parameters takes approximately 10 minutes of CPU time, and 5000 posterior draws are returned (i.e., M = 5000). This sample is special since there is only one long latent string in it. ZT failed to obtain their estimates from the Gibbs sampler with data augmentation, claiming that the long latent string and the large AR(1) coefficient in the error term of their model contaminated the iterating procedure of the Gibbs sampler. Our simulation in Section 4 produces fairly good results even with long latent strings and large value(s) of $\lambda_1$ (and $\lambda_2$). The computation here performs quite satisfactorily as well. In Table 5, Case I refers to model (17) with p = 1 and Case II to model (17) with p = 2. Line b refers to the OLS estimates with the censored data. Line c refers to the estimates of the proposed method with the censored data. The standard deviation of each posterior mean is in parentheses. The numerical standard error of each posterior mean is in square brackets. The one-period lagged correlation of the Gibbs run is in curly brackets. The results are basically consistent with our expectations except for the coefficient of the price ratio variable. The large standard deviation of this estimate indicates that the data are not informative enough to narrow down our belief about this parameter. Using a truncated prior for $\beta_3$, i.e., $\beta_3 < 0$, improves the results. This is done by restricting $\beta_3$ to be negative (see Gelfand, Smith and Lee (1992)). We only need to add one more line to our code for this problem. The results are reported in line d of Table 5. For both cases, this economic constraint
Table 5. VER Trade Example

Case I: Demand for Japanese Cars in the U.S. (Sample Size 75 and Censoring Level 27%)

Line   $\beta_1$       $\beta_2$    $\beta_3$    $\lambda_1$   $\sigma^2$
b      -9.55 (4.14)    .86 (.42)    .04 (.35)    .74 (.08)     .020 (.003)
c      -5.82 (4.32)    .48 (.45)    .20 (.42)    .77 (.09)     .023 (.004)
       [.0006] [.006] [.002] [.070] [.007]
       {-.10} {-.28} {-.06} {-.23} {-.26}
d      -6.19 (.89)     .52 (.10)    -.09 (.07)   .77 (.02)     .022 (.004)
       [.0002] [.004] [.0004] [.024] [.002]
       {.50} {.42} {-.69} {.24} {-.22}

Case II: Demand for Japanese Cars in the U.S. (Sample Size 75 and Censoring Level 27%)

Line   $\beta_1$       $\beta_2$    $\beta_3$    $\lambda_1$   $\lambda_2$   $\sigma^2$
b      -10.00 (4.15)   .89 (.42)    .02 (.35)    .67 (.12)     .04 (.12)     .020 (.003)
c      -5.40 (4.32)    .43 (.44)    .18 (.41)    .66 (.13)     .10 (?)       .02 (.004)
       [.006] [.066] [.007] [.002] [.002] [.0006]
       {-.08} {.02} {.005} {-.07} {-.007} {-.07}
d      -5.91 (1.04)    .49 (.10)    -.10 (.08)   .66 (.04)     .10 (?)       .021 (.004)
       [.003] [.040] [.004] [.001] [.001] [.0001]
       {-.05} {-.04} {.11} {.04} {-.07} {-.01}

Note: Line b refers to the OLS estimates with the censored data; line c to the estimates of the proposed method with the censored data; and line d to the estimates of the proposed method with the parameter constraint $\beta_3 < 0$. The standard deviation of each posterior mean is in parentheses. The numerical standard error of each posterior mean is in square brackets. The one-period lagged correlation of the Gibbs run is in curly brackets. The estimates are based on m = 2000 and M = 5000.
improves the efficiency of the parameter estimates. The very small estimated value of $\lambda_2$ in Case II implies that $y_t^*$ can reasonably be described as an AR(1) process. A simple comparison of the OLS estimates (line b) with our Bayesian estimates (line c) indicates that the OLS method over-estimates the real income effect and under-estimates the "inertial" effect of desired demand. The marginal prior (dotted curves) and posterior (solid curves) densities of all parameters for Case II are shown in Figure 4.
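The sign restriction on the price coefficient can be imposed inside a Gibbs sampler by drawing that coefficient from its normal conditional distribution truncated to the negative half-line, in the spirit of Gelfand, Smith and Lee (1992). The following is a minimal sketch of one such draw by inverse-CDF sampling; the function name is hypothetical, and the conditional mean and standard deviation used in the example are placeholders chosen for illustration, not quantities computed from the data.

```python
import random
from statistics import NormalDist

def draw_truncated_above(mean, sd, upper, rng):
    """One draw from N(mean, sd^2) restricted to values below `upper`.

    Inverse-CDF sampling: draw u uniformly on (0, F(upper)) and return
    F^{-1}(u). Not intended for cases where F(upper) is numerically zero.
    """
    dist = NormalDist(mean, sd)
    p_upper = dist.cdf(upper)                 # probability mass below the bound
    u = max(rng.random() * p_upper, 1e-300)   # keep u strictly positive
    return dist.inv_cdf(u)

rng = random.Random(0)
# Placeholder conditional moments for the constrained coefficient (illustrative only).
cond_mean, cond_sd = 0.20, 0.42
draws = [draw_truncated_above(cond_mean, cond_sd, 0.0, rng) for _ in range(2000)]
avg = sum(draws) / len(draws)
print("share of draws below zero:", sum(d < 0 for d in draws) / len(draws))
print("mean of constrained draws:", avg)
```

Within a sampler, this draw would replace the unconstrained normal draw of the coefficient at each iteration, with the conditional moments recomputed from the current values of the other parameters and the latent data.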
6 Conclusion
This paper has developed a simulation-based Bayesian method for the estimation of a dynamic Tobit model. Because the likelihood functions of these models are analytically intractable, traditional Bayesian and classical estimation methods are not suitable in this situation. More specifically, the high-dimensional integrals in the likelihood functions, induced by the dependence among the unobserved (latent) observations, make it extremely difficult to evaluate both the likelihood and the posterior directly. The solution proposed in this paper is to develop a sampling scheme for the conditional posterior distribution of the latent data so that the Gibbs sampler with the data augmentation algorithm can be successfully applied. The concept of the latent string plays a role in this analysis since it provides an easy way to learn about the unobserved data. The advantages of this Bayesian approach are: (i) it provides a unifying approach to both stationary and nonstationary dynamic Tobit models; (ii) it is attractive from both theoretical and practical viewpoints; (iii) both informative and non-informative prior beliefs can be incorporated into the estimation process; (iv) the method performs satisfactorily in various circumstances, for example, with small sample sizes or nonstationarity, as shown in a Monte Carlo study; and (v) the method can easily be extended to dynamic Tobit models with Student-t errors. The proposed procedure is applied to a regression study of Japanese exports of passenger cars to the U.S. subject to a non-tariff trade barrier. This exercise shows the appropriateness of using a dynamic Tobit model in such a situation and provides some useful experience with a real application.

Figure 3. The CUMSUM Plots (cf. VER Trade Example) (horizontal axis of each panel: sample size/50)

Figure 4. Prior and Posterior Plots (cf. VER Trade Example)
ACKNOWLEDGMENTS
I am deeply indebted to Dale J. Poirier for the stimulating discussions, comments and support. I am also grateful for the helpful comments by Gorden Kemp, Gary Koop, Jin-Chuan Duan, Luc Bauwens, and the participants in the Econometric Seminar at the University of Toronto. I would also like to thank John Geweke, who kindly provided me with his Fortran code for sampling from a truncated multivariate normal distribution. The paper by Zangari and Tsurumi (1996) was brought to my attention by Dale J. Poirier long after a version of the present paper had been completed. This version takes their paper into account. I thank them for providing me with their data. I also thank two anonymous referees who helped improve the paper. Financial support from a University of Toronto Doctor Fellowship and a CORE research fellowship is gratefully acknowledged. A previous version of this paper appeared as # 9781 in the CORE discussion paper series. All remaining errors are, however, the author's responsibility.
References
[1] Amemiya, T. (1984): "Tobit Models: A Survey," Journal of Econometrics 24, 3-61.
[2] Amemiya, T. (1985): Advanced Econometrics. Harvard University Press, Cambridge, MA.
[3] Chib, S. (1992): "Bayes Inference in the Tobit Censored Regression Model," Journal of Econometrics 51, 79-99.
[4] Dagenais, M.G. (1982): "The Tobit Model with Serial Correlation," Economics Letters 10, 263-267.
[5] Gelfand, A.E. and Smith, A.F.M. (1990): "Sampling Based Approaches to Calculating Marginal Densities," Journal of American Statistical Association 85, 398-409.
[6] Gelfand, A.E., Smith, A.F.M. and Lee, T. (1992): "Bayesian Analysis of Constrained Parameter and Truncated Data Problems Using Gibbs Sampling," Journal of American Statistical Association 87, 523-532.
[7] Geweke, J. (1991): "Efficient Simulation from the Multivariate Normal and Student-t Distributions Subject to Linear Constraints," in E.M. Keramidas ed., Computing Science and Statistics: Proceedings of the Twenty-Third Symposium of the Interface, 571-578.
[8] Geweke, J. (1992): "Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments (with discussion)," in J.M. Bernardo, J.O. Berger, A.P. Dawid and A.F.M. Smith, eds. Bayesian Statistics 4. Oxford University Press, 169-193.
[9] Geweke, J. (1993): "Bayesian Treatment of the Independent Student-t Linear Model," Journal of Applied Econometrics 8, 19-40.
[10] Greene, W.H. (1993): Econometric Analysis. 2d ed., Macmillan, New York.
[11] Hajivassiliou, V., McFadden, D. and Ruud, P. (1996): "Simulation of Multivariate Normal Rectangle Probabilities and Their Derivatives: Theoretical and Computational Results," Journal of Econometrics 72, 85-143.
[12] Kodres, L.E. (1988): "Tests of Unbiasedness in Foreign Exchange Future Markets: The Effects of Price Limits," Review of Futures Markets 7, 138-166.
[13] Lee, L. (1997): "Estimation of Dynamic and ARCH Tobit Models," Working Paper, Department of Economics, The Hong Kong University of Science and Technology.
[14] Maddala, G.S. (1987): Limited Dependent and Qualitative Variables in Econometrics. Cambridge University Press, New York.
[15] Morgan, I.G. and Trevor, R.G. (1997): "Limit Moves as Censored Observations of Equilibrium Futures Price in GARCH Processes," Working Paper, School of Business, Queen's University.
[16] Peristiani, S. (1994): "An Empirical Investigation of the Determinants of Discount Window Borrowing: A Disaggregate Analysis," Journal of Banking and Finance 18, 183-197.
[17] Poirier, D.J. and Ruud, P.A. (1988): "Probit with Dependent Observations," Review of Economic Studies 55, 593-614.
[18] Poirier, D.J. (1995): Intermediate Statistics and Econometrics: A Comparative Approach. MIT Press, Cambridge.
[19] Ripley, B.D. (1987): Stochastic Simulation. Wiley, New York.
[20] Tanner, M.A. and Wong, W.H. (1987): "The Calculation of Posterior Distributions by Data Augmentation," Journal of American Statistical Association 82, 528-550.
[21] Tobin, J. (1958): "Estimation of Relationship for Limited Dependent Variables," Econometrica 26, 24-36.
[22] Tsurumi, H. and Tsurumi, Y. (1983): "US-Japan automobile trade: a Bayesian test of a product life cycle," Journal of Econometrics 23, 193-210.
[23] Yu, B. and Mykland, P. (1994): "Looking at Markov Samplers through Cusum Path Plots: A Simple Diagnostic Idea," Technical Report 413, Department of Statistics, University of California at Berkeley.
[24] Zangari, P.J. and Tsurumi, H. (1996): "A Bayesian Analysis of Censored Autocorrelated Data on Exports of Japanese Passenger Cars to the U.S.," in R. Carter Hill ed., Advances in Econometrics 11. JAI Press Inc., 111-143.
[25] Zeger, S.L. and Brookmeyer, R. (1986): "Regression Analysis with Censored Autocorrelated Data," Journal of American Statistical Association 81, 722-729.
[26] Zellner, A. (1971): An Introduction to Bayesian Inference in Econometrics. Wiley, New York.
Appendix

Proof of Proposition 3.1. Model (1) implies that $y_t^*$ is an AR(p) latent process. As noted by Zeger and Brookmeyer (1986), the standard Markov property does not hold with respect to $y_t$. Using Proposition 2.1 of Zeger and Brookmeyer, we obtain (A1), in which the conditioning set reaches back from $y_t^*$, through the intervening censored observations, to the most recent set of p consecutive uncensored observations preceding $y_t^*$. Symmetrically, we can derive (A2), in which the conditioning set reaches forward from $y_t^*$, through the intervening censored observations, to the nearest set of p consecutive uncensored observations following $y_t^*$.

The intuition behind (A1) and (A2) is simple. Given a sample $\{y_t\}_{t=1}^{T}$ (and the parameter vector $\theta$) of model (1), how can we efficiently learn about the latent data $y_t^*$ ($y_t^* \le 0$)? The above two equations give the minimum set of historical and future values of $y_t$ that are relevant to learning these unknowns (conditional on $\theta$).

Now suppose that a block of the sample, consisting of p consecutive uncensored observations, the latent data that follow them, and the next p consecutive uncensored observations, is a latent string. The above results imply that the two conditional distributions of the latent data in the latent string are equivalent: (i) conditioning on all observations in the sample (and the parameter vector $\theta$); and (ii) conditioning only on the observations in the latent string (and the parameter vector $\theta$). This means that any two latent data located in two distinct latent strings are conditionally independent given all the observables in the sample and the parameter vector $\theta$. The result of Proposition 3.1 is then proved.
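To make the notion of a latent string concrete, the sketch below locates such strings in a sequence of censoring indicators. It rests on one reading of the description above, which is an assumption: censored observations separated by fewer than p uncensored observations belong to the same string, and each string is extended by p uncensored observations on either side (clipped at the ends of the sample). The function name is hypothetical.

```python
def latent_strings(censored, p):
    """Return inclusive (start, stop) index pairs, one per latent string.

    `censored` is a sequence of booleans (True = censored observation);
    `p` is the autoregressive order.
    """
    idx = [t for t, c in enumerate(censored) if c]
    if not idx:
        return []
    cores = []                      # spans from first to last censored index
    start = prev = idx[0]
    for t in idx[1:]:
        if t - prev - 1 >= p:       # at least p uncensored in between: new string
            cores.append((start, prev))
            start = t
        prev = t
    cores.append((start, prev))
    n = len(censored)
    # Extend each core by p observations on both sides, within the sample.
    return [(max(a - p, 0), min(b + p, n - 1)) for a, b in cores]

flags = [False, False, True, True, False, True,
         False, False, False, True, False, False]
print(latent_strings(flags, p=2))   # [(0, 7), (7, 11)]
```

In this example the censored observations at positions 2, 3 and 5 fall into one string because only one uncensored observation separates positions 3 and 5, while position 9 starts a new string. The two strings share an observed value at position 7, but no latent value belongs to both.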