SIMULATING TRANSITIONS USING DISCRETE CHOICE MODELS

Alan Duncan, University of York, UK and Institute for Fiscal Studies
Melvyn Weeks, Department of Applied Economics, University of Cambridge, UK

Abstract: We discuss various techniques by which statistical models of discrete choice might be calibrated in order for predicted states to coincide exactly with observed outcomes. We apply these calibration methods to the problem of simulating transitions in discrete states in response to an exogenous shock, and find there to be important differences in the sensitivities of predicted transitions frequencies to the choice of calibration method.

Acknowledgements: We are grateful to Karim Abadir, Richard Blundell and Hashem Pesaran for valuable comments and suggestions. The financial support of the ESRC Centre for the Micro-Economic Analysis of Fiscal Policy at IFS is gratefully acknowledged. The usual disclaimer applies.

1. Introduction

In both the social and physical sciences the use of discrete choice or stimulus and response models is commonplace. This is based upon the recognition that many phenomena involve the choice between, or passage through, discrete, identifiable states. Survey articles such as Amemiya (1981) provide a large number of references for applications in economics and bioassay. There has been a movement towards the use of such qualitative models to simulate behavioural transitions in response to some exogenous shock.[1] While such applications are undoubtedly of great relevance, the policy impact of transitions studies of this kind tends to be offset to a degree by the notorious, empirically observed tendency for outcome-based measures of fit in such models to be relatively poor in cross-sectional studies. The literature abounds with empirical studies where the authors report a systematic over- or under-prediction of certain state frequencies.[2] Whilst problems of this kind may simply indicate some form of specification error, there is some suggestion that even in functionally well-specified models the predictive performance is poor, particularly where some states are relatively densely or sparsely represented in the data.

[1] See Duncan and Giles (1996) and Bingley and Walker (1995) for microeconometric applications.
[2] See Cramer (1996) for a review of this problem in both the social and natural sciences.

In this paper we examine alternative transitions estimators which control for the systematic under- or over-prediction of state frequencies inherent in many discrete choice models. We derive an exact expression for the transitions estimator in a binary choice framework, and suggest how similar transitions estimators may be simulated when dealing with higher dimensional problems. A Monte Carlo study compares the properties of our suggested estimator with those more frequently found in the literature under alternative experimental designs.

2. A Statistical Framework

Consider as a general framework a statistical model of the choice among $J$ states, where the utility $u_j^*$ to be enjoyed in each state $j$ can be expressed in terms of an observable set of $k$ characteristics $x = (x_1, x_2, \ldots, x_k)'$ and an unobservable component $\varepsilon_j$ with zero mean and covariance $\Omega$.[3] Assume that the utility to be enjoyed in each state depends linearly on characteristics, such that

$$u_j^* = x'\beta_j + \varepsilon_j, \qquad j = 1, \ldots, J \tag{1}$$

for sets of parameters $\beta_j = (\beta_{j1}, \beta_{j2}, \ldots, \beta_{jk})'$. These latent variables are related to observed quantities $y \in \{1, 2, \ldots, J\}$ via the mapping

$$y = \arg\max_j \, [u_j^*, \; j = 1, \ldots, J].$$

We may then write the conditional domain of $u^*$ as $B(y) = \{u^* \mid y = \arg\max[u^*]\}$. Based on the stochastic assumptions on $\varepsilon_j$ and for a given set of parameters $\beta = [\beta_1, \ldots, \beta_J]$ we can (in principle, at least) evaluate probabilities $P_j(x, \beta) = \Pr(y = j \mid x, \beta)$ for each state $j = 1, \ldots, J$.

[3] For notational simplicity, and without loss of generality, we restrict our attention to observable characteristics which are state invariant.
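As an illustration of the mapping from latent utilities to observed states, the following sketch simulates model (1) for a single vector of characteristics and compares simulated state frequencies with analytical choice probabilities. It assumes independent standard Gumbel (type I extreme value) disturbances, which yield multinomial logit probabilities; the paper does not impose this distribution at this point, and the function names and parameter values are hypothetical.

```python
import numpy as np

def choice_probabilities(x, beta):
    """Multinomial logit P_j(x, beta) for one x of shape (k,) and beta of shape (J, k)."""
    v = beta @ x
    v = v - v.max()                      # guard against overflow in exp
    p = np.exp(v)
    return p / p.sum()

def simulate_choices(x, beta, n_draws, rng):
    """Draw u*_j = x'beta_j + eps_j and return y = argmax_j u*_j, coded 1..J."""
    eps = rng.gumbel(size=(n_draws, beta.shape[0]))   # assumed error distribution
    u_star = beta @ x + eps                           # broadcasts to (n_draws, J)
    return u_star.argmax(axis=1) + 1

rng = np.random.default_rng(0)
x = np.array([1.0, 0.5, -0.3])                        # illustrative characteristics
beta = np.array([[0.0, 0.0, 0.0],
                 [0.2, 1.0, -1.0],
                 [-0.5, 0.4, 0.8]])                   # illustrative parameters, J = 3
y = simulate_choices(x, beta, 200_000, rng)
freq = np.bincount(y, minlength=4)[1:] / y.size       # simulated state frequencies
probs = choice_probabilities(x, beta)                 # analytical P_j(x, beta)
```

Under these assumptions `freq` should lie close to `probs`, which is the sense in which the probabilities $P_j(x, \beta)$ follow from the stochastic assumptions on $\varepsilon_j$.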

For $i = 1, \ldots, n$ indexing a random sample from the population, and for $P_{ij}(x_i, \beta) = \Pr(y_i = j \mid x_i, \beta)$, the log-likelihood for (1) may be written as

$$L(\beta) = \sum_{i=1}^{n} \sum_{j \in J} y_{ij} \ln P_{ij}(x_i, \beta) \tag{2}$$

with the score

$$S(\beta) = \sum_{i=1}^{n} \sum_{j \in J} \frac{\partial \ln P_{ij}(x_i, \beta)}{\partial \beta} \, y_{ij}. \tag{3}$$
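A minimal sketch of the log-likelihood (2) and score (3) may help fix ideas. It assumes a multinomial logit form for $P_{ij}$ and takes $y_{ij} = 1(y_i = j)$; both are assumptions of the example rather than requirements of the general framework. Under the logit form the score with respect to $\beta_j$ reduces to $\sum_i (y_{ij} - P_{ij}) x_i$, which the code checks against central finite differences. All names and values are illustrative.

```python
import numpy as np

def logit_probs(X, beta):
    """P_ij for X of shape (n, k) and beta of shape (J, k); rows sum to one."""
    v = X @ beta.T
    v = v - v.max(axis=1, keepdims=True)
    p = np.exp(v)
    return p / p.sum(axis=1, keepdims=True)

def log_likelihood(beta, X, Y):
    """Equation (2): sum over i and j of y_ij * ln P_ij."""
    return float(np.sum(Y * np.log(logit_probs(X, beta))))

def score(beta, X, Y):
    """Equation (3) in the logit case: row j is sum_i (y_ij - P_ij) x_i."""
    return (Y - logit_probs(X, beta)).T @ X

rng = np.random.default_rng(1)
n, k, J = 500, 3, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = rng.normal(size=(J, k))                      # illustrative parameter values
y = np.array([rng.choice(J, p=p) for p in logit_probs(X, beta)])
Y = np.eye(J)[y]                                    # n x J indicator matrix

# Numerical score by central finite differences, one parameter at a time
h = 1e-6
numerical = np.zeros_like(beta)
for j in range(J):
    for m in range(k):
        step = np.zeros_like(beta)
        step[j, m] = h
        numerical[j, m] = (log_likelihood(beta + step, X, Y)
                           - log_likelihood(beta - step, X, Y)) / (2 * h)
analytical = score(beta, X, Y)
```

The parameters of a logit model with state-invariant characteristics are identified only up to a normalisation (for example $\beta_1 = 0$); the sketch evaluates the likelihood and score at a given $\beta$ and does not attempt estimation.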

Among the range of estimation procedures available for models of this kind, one may apply maximum likelihood techniques to solve the first-order condition $\frac{1}{n} \sum_{i=1}^{n} s_i(\beta) = 0$, where $s_i(\cdot)$ is the $i$th contribution to the score. Alternatively, the method of moments (or simulated moments) may be used.

2.1 Measuring predictive performance

Consider now the evaluation of model (1) based on a sample of data of size $N$. Let the set $\{y_i, x_i\}$ for $i = 1, \ldots, N$ denote observations on $y$ and $x$, and define the $N \times J$ matrix $Y = [Y_1, \ldots, Y_J]$ with typical element $1(y_i = j)$ for all $i, j$. For a given parameter vector $\beta$, we may compile a matrix $P$ of sample probabilities with typical element $P_{ij}(x_i, \beta) = \Pr(y_i = j \mid x_i, \beta)$. A common empirical practice is to model discrete outcomes on the basis of probabilities $P_{ij}(x_i, \beta)$ using a maximum probability rule. If we let $\hat{y}_i$ represent the predicted state, then

$$\hat{y}_i = \arg\max_j \, [P_{ij}(x_i, \beta), \; j = 1, \ldots, J]. \tag{4}$$

One outcome-based performance measure for a qualitative model of this form would compare frequencies of predicted and observed outcomes over the range of $y$ across the sample.[4] Defining the $N \times J$ matrix $\hat{Y}_{MP}$ with typical element $1(\hat{y}_i = j)$ for all $i, j$, where $1(\cdot)$ represents the indicator function, we can summarise the proximity of predicted to observed states in terms of the $J \times J$ matrix $\frac{1}{N} Y' \hat{Y}_{MP}$, the trace of which represents the proportion of observations for which the predicted and observed states coincide. For a model to predict perfectly requires that $\mathrm{tr}(\frac{1}{N} Y' \hat{Y}_{MP}) = 1$, an empirical feat which is rare.[5]

[4] This is not the only measure of fit available; see Windmeijer (1995) for a useful comparison of goodness-of-fit measures in binary choice models.
[5] As Pudney (198?) notes, this particular measure is misleading since it has no asymptotic justification. That is to say, $\mathrm{plim} \; \mathrm{tr}(\frac{1}{N} Y' \hat{Y}_{MP}) \neq 1$ even in the case of a well specified model.

2.2 Simulating transitions

Another empirical practice which has gained some currency in recent studies is to use a qualitative model of modal choice as a basis for simulating transitions between states in response to some exogenous shock (see Duncan and Weeks (1997)). Consider a counterfactual where the vector of characteristics $x_i$ for each observation in the sample is superseded by some other set $x_i^R \neq x_i$. Based upon the same parameter vector $\beta$, we may simulate the counterfactual state $\hat{y}_i^R$ for the $i$th observation by employing the same maximum probability rule as before, but in terms of the counterfactual $x_i^R$, giving

$$\hat{y}_i^R = \arg\max_j \, [P_{ij}(x_i^R, \beta), \; j = 1, \ldots, J]. \tag{5}$$

Standard practice builds a matrix of transitions frequencies as follows: define a matrix $\hat{Y}_{MP}^R$ with typical element $1(\hat{y}_i^R = j)$ for all $i, j$. Then, a summary of transitions frequencies based on the maximum probability rules (4) and (5) can be expressed in terms of the $J \times J$ matrix

$$T_{MP} = \hat{Y}_{MP}' \hat{Y}_{MP}^R \tag{6}$$

the trace of which represents the number of observations for which the predicted states remain the same when one moves from the base to the counterfactual regime. Clearly, $\mathrm{tr}(T_{MP}) = N$ for a counterfactual where no transitions are predicted. However, since the transitions frequencies are constructed using the maximum probability rule, there is no guarantee that they will converge to their true values even if the behavioural parameters of the discrete choice model are themselves consistent. It is therefore difficult to place any real faith in predicted transitions frequencies of this form unless and until we are able to correct in some way the finite sample bias that exists in predictions of frequencies under the maximum probability rule.

2.3 Calibration Methods

Consider the set of $J$ state-specific utilities $u_{ij}^* = x_i'\beta_j + \varepsilon_{ij}$ for the $i$th observation in a sample of size $N$. To force this model to predict the observed outcome $y_i$ through the maximum probability mapping (4), we must place bounds on the values of the unobservable components of utility $\varepsilon_{ij}$. By exploiting any assumptions that are made about the stochastic distribution of the model, we may generate a realisation $e_{ij}$ which can then be factored into latent relationships of the form $u_{ij}^* = x_i'\beta_j + e_{ij}$ to recover predictions $\hat{y}_i$ which coincide exactly with $y_i$.

Suppose that we observe the $i$th outcome as $y_i = j^* \in \{1, \ldots, J\}$. If our assumed model relies on the latent relationships $u_{ij}^* = x_i'\beta_j + \varepsilon_{ij}$ for $j = 1, \ldots, J$, then $\hat{y}_i = j^*$ through the mapping (4) requires that $u_{ij^*}^* > u_{ij}^*$ for all $j \neq j^*$, which implies that

$$(\varepsilon_{ij} - \varepsilon_{ij^*}) < x_i'(\beta_{j^*} - \beta_j) \quad \text{for all } j \neq j^*. \tag{7}$$

A calibrated state predictor

Our two proposed calibration methods derive realisations $e_{ij} - e_{ij^*}$ which respect the bounds (7) on the unobserved components of utility and which will guarantee that $\hat{y}_i = y_i$ for all $i$. We may evaluate $e_{ij} - e_{ij^*}$ as the conditional expectation

$$(e_{ij} - e_{ij^*}) = E[\varepsilon_{ij} - \varepsilon_{ij^*} \mid \varepsilon_{ij} - \varepsilon_{ij^*} < x_i'(\beta_{j^*} - \beta_j)]; \tag{8}$$

Gourieroux, Monfort, Renault, and Trognon (1987) refer to this expression as the (conditional) prediction error, otherwise known as the generalised residual. Alternatively we may realise $e_{ij} - e_{ij^*}$ by drawing at random from the conditional distribution of $u_i^*$ given $y_i$;

$$f(\varepsilon_{ij} - \varepsilon_{ij^*} \mid \varepsilon_{ij} - \varepsilon_{ij^*} < x_i'(\beta_{j^*} - \beta_j)). \tag{9}$$

For $\beta$ unknown we simply replace $\beta$ by $\hat{\beta}$ in either case.

The binomial case

Consider the simplest version of the statistical model (1) outlined above for which $J = 2$, such that

$$\begin{aligned} u_{i1}^* &= x_i'\beta_1 + \varepsilon_{i1} \\ u_{i2}^* &= x_i'\beta_2 + \varepsilon_{i2} \end{aligned} \tag{10}$$

Expressing (10) in differenced form yields

$$u_{i2}^* - u_{i1}^* = x_i'(\beta_2 - \beta_1) + (\varepsilon_{i2} - \varepsilon_{i1}) = x_i'\beta + \varepsilon_i. \tag{11}$$

To calibrate this model using (8) for $y_i = j^* \in \{1, 2\}$ we may evaluate the generalised residuals

$$e_i = E[\varepsilon_i \mid y_i = j^*, x_i, \beta] \tag{12}$$

given a distribution for $\varepsilon_i$.[6] The calibrated latent predictor then becomes $\widehat{u_{i2}^* - u_{i1}^*} = x_i'\beta + e_i$, to yield

$$\hat{y}_i^{CE} = 1 + 1(x_i'\beta + e_i > 0). \tag{13}$$

[6] If we assume, for example, that the disturbances $\varepsilon_{ij}$ are bivariate normal, then the differenced disturbance $\varepsilon_i = \varepsilon_{i2} - \varepsilon_{i1}$ is univariate normal and $e_i = \phi_i [(y_i - 1) - \Phi_i] / [\Phi_i (1 - \Phi_i)]$, where $\phi_i = \phi(x_i'\beta)$ and $\Phi_i = \Phi(x_i'\beta)$; see Gourieroux and Monfort (1993).

The alternative calibration technique (9) requires that we draw $\tilde{e}_{ij} - \tilde{e}_{ij^*}$ at random from the conditional distribution $f(\varepsilon_{ij} - \varepsilon_{ij^*} \mid \varepsilon_{ij} - \varepsilon_{ij^*} < x_i'(\beta_{j^*} - \beta_j)) = f(\varepsilon_i \mid y_i = j^*)$. In the binomial case, we may do so by applying the Inverse Transformation Theorem. In general,

$$\tilde{e}_i = F^{-1}\left[ 1(y_i = 2) \{ \upsilon F_i + (1 - \upsilon)(1 - F_i) \} + \upsilon (1 - F_i) \right] \tag{14}$$

gives the conditional draw, where $F_i = F(x_i'\beta)$ for some symmetric distribution function $F(\cdot)$ and $\upsilon \sim U[0, 1]$. For $y_i = 2$ the argument of $F^{-1}$ lies in $(1 - F_i, 1)$, so that $x_i'\beta + \tilde{e}_i > 0$; for $y_i = 1$ it lies in $(0, 1 - F_i)$, so that $x_i'\beta + \tilde{e}_i < 0$. This gives an alternative state predictor (calibrated using (14)) of the form

$$\hat{y}_i^{CD} = 1 + 1(x_i'\beta + \tilde{e}_i > 0). \tag{15}$$

In higher dimensional problems, we require the use of simulation methods to approximate a conditional moment based upon the distribution of the underlying latent variable; namely, we must evaluate the conditional expectation $E(\varepsilon \mid y)$. In discrete choice models where the stochastic terms are normally distributed, the use of generalised residuals as a calibration tool utilises simulation techniques in a form similar to the method of simulated scores. An example of this approach is the simulated EM (SEM) algorithm, which approximates the expectations operator by averaging repeated draws from the conditional distribution of $u^*$ given $y$, and therefore requires draws from the truncated multivariate normal distribution.

3. Simulating Predicted State Transitions

We have already highlighted the possibility of using discrete choice models to simulate transitions between states, and noted that the current practice (of using the maximum probability rule based on (5)) does not in general lead to correct predictions of transitions frequencies. Here we discuss alternative means by which more accurate transitions frequencies may be predicted. In simpler (lower-dimensional) problems and under specific distributional assumptions we derive explicit closed-form expressions for transitions frequencies. Where no closed-form solution exists we suggest how one might approximate transitions frequencies using simulation methods.
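As a reference point for the current practice described above, the following sketch builds the indicator matrices and the transitions matrix $T_{MP}$ of (6) for a binary model. The logistic form of $F$, the parameter values and the size of the shock are illustrative assumptions, and the helper names are hypothetical.

```python
import numpy as np

def indicator_matrix(states, J):
    """N x J matrix with (i, j) element 1(state_i = j), states coded 1..J."""
    return np.eye(J, dtype=int)[np.asarray(states) - 1]

def max_prob_states(P):
    """Maximum probability rule: the predicted state is the row-wise argmax."""
    return P.argmax(axis=1) + 1

def binary_logit_probs(X, beta):
    """Columns hold Pr(y = 1) and Pr(y = 2), with Pr(y = 2) = F(x'beta), F logistic."""
    F = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return np.column_stack([1.0 - F, F])

rng = np.random.default_rng(2)
N = 1000
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.uniform(-1, 1, size=N)])
beta = np.array([0.0, 1.0, -1.0])                 # assumed parameter values
y = 1 + (X @ beta + rng.logistic(size=N) > 0)     # observed states, coded 1 or 2

XR = X.copy()
XR[:, 1] += 1.0                                   # hypothetical positive shock to x2

Y = indicator_matrix(y, 2)
Y_mp = indicator_matrix(max_prob_states(binary_logit_probs(X, beta)), 2)
YR_mp = indicator_matrix(max_prob_states(binary_logit_probs(XR, beta)), 2)

hit_rate = np.trace(Y.T @ Y_mp) / N               # tr(Y' Y_MP / N)
T_mp = Y_mp.T @ YR_mp                             # equation (6)
```

Row $j$ and column $k$ of `T_mp` count observations predicted in state $j$ before the shock and in state $k$ after it. Because the rule only registers a move when the predicted probability crosses 0.5, these counts need not resemble the transitions implied by the underlying model.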

3.1 Theoretical transitions frequencies

Consider again a probabilistic model of discrete choice for a sample of data of size $N$ where $y_i \in \{1, \ldots, J\}$ represents a discrete state indicator and $x_i$ denotes a set of characteristics. For utilities $u_{ij}^* = x_i'\beta_j + \varepsilon_{ij}$, $j = 1, \ldots, J$, we may derive probabilities $P_{ij} = \Pr(y_i = j \mid x_i, \beta)$ for all states $j$ given a specific distribution for $\varepsilon_{ij}$. Our interest centres on the effects on these probabilities following an exogenous shock $x_i^R \neq x_i$ which alters utilities from $u_{ij}^*$ to $u_{ij}^R = (x_i^R)'\beta_j + \varepsilon_{ij}$. That is, we are in general interested in the probability $P_{i(j^* \to j^R)}$ of transition from any state $j^*$ to $j^R$, where in general

$$P_{i(j^* \to j^R)} = \Pr\left(u_{ij^R}^R > u_{ij}^R \;\; \forall j \neq j^R \;\text{ and }\; u_{ij^*}^* > u_{ij}^* \;\; \forall j \neq j^*\right).$$

When utilities are expressed in linear form,

$$P_{i(j^* \to j^R)} = \Pr(y_i = j^* \mid x_i, \beta) \cdot \Pr\left(\varepsilon_{ij} - \varepsilon_{ij^R} < (x_i^R)'(\beta_{j^R} - \beta_j) \;\; \forall j \neq j^R \;\middle|\; \varepsilon_{ij} - \varepsilon_{ij^*} < x_i'(\beta_{j^*} - \beta_j) \;\; \forall j \neq j^*\right).$$

To condition transitions probabilities on observed $y_i$, simply replace $\Pr(y_i = j^* \mid x_i, \beta)$ by $1(y_i = j^*)$. By cumulating transitions probabilities (either unconditional or conditioned on $y_i$) over the sample, we may derive theoretical transitions frequencies $n_{(j^* \to j^R)}$ where

$$n_{(j^* \to j^R)} = \sum_{i=1}^{N} P_{i(j^* \to j^R)} \quad \text{for all } j^*, j^R = 1, \ldots, J. \tag{16}$$

The binomial case

Let us again consider the simple case for which $J = 2$. Based on the statistical model (11) and given some distribution function $F(\cdot)$ for $\varepsilon_i$, we have that

$$P_{i(1 \to 1)} = P_{i1} \cdot \left\{ \frac{1 - F_i^R}{1 - F_i} \cdot I_i + I_i^R \right\} \tag{17}$$

$$P_{i(1 \to 2)} = P_{i1} \cdot \left\{ \frac{F_i^R - F_i}{1 - F_i} \cdot I_i \right\} \tag{18}$$

$$P_{i(2 \to 1)} = P_{i2} \cdot \left\{ \frac{F_i - F_i^R}{F_i} \cdot I_i^R \right\} \tag{19}$$

$$P_{i(2 \to 2)} = P_{i2} \cdot \left\{ \frac{F_i^R}{F_i} \cdot I_i^R + I_i \right\} \tag{20}$$

for all $i = 1, \ldots, N$, where $P_{i2} = 1 - P_{i1} = F_i = F(x_i'\beta)$, $F_i^R = F[(x_i^R)'\beta]$, $I_i = 1[x_i'\beta < (x_i^R)'\beta]$ and $I_i^R = 1[x_i'\beta \geq (x_i^R)'\beta]$. A matrix $T$ of theoretical transitions frequencies may then be defined as

$$T = \sum_{i=1}^{N} T_i, \tag{21}$$

with $T_i$ the $2 \times 2$ matrix of transitions probabilities (17) to (20) for the $i$th observation. We may condition transitions frequencies on the observed discrete state indicator $y_i$ by replacing $P_{i1}$ and $P_{i2}$ in (17) to (20) by $1(y_i = 1)$ and $1(y_i = 2)$ respectively.

3.2 Simulated transitions frequencies

Although an explicit solution exists for transitions frequencies in the binomial case, it is less easy to derive equivalent expressions for higher-dimensional models. In this paper, we discuss an alternative strategy which adjusts the maximum probability rule to rectify the problem of misclassification using the naive criterion (5). Our proposed method brings the method of calibration to bear on the problem of predicting transitions following an exogenous shock $x_i^R \neq x_i$. For tractability, we focus initially on the binomial case. Define the calibrated state predictors

$$\hat{y}_i^{CE(R)} = 1 + 1[(x_i^R)'\beta + e_i > 0] \tag{22}$$

and

$$\hat{y}_i^{CD(R)} = 1 + 1[(x_i^R)'\beta + \tilde{e}_i > 0], \tag{23}$$

where $e_i$ and $\tilde{e}_i$ are given by (12) and (14) respectively, and consider two $N \times J$ matrices $\hat{Y}_{CE}^R$ and $\hat{Y}_{CD}^R$ with typical $(i, j)$th elements $1(\hat{y}_i^{CE(R)} = j)$ and $1(\hat{y}_i^{CD(R)} = j)$ for all $i, j$. By direct analogy with (6) we may construct transitions matrices

$$T_{CE} = \hat{Y}_{CE}' \hat{Y}_{CE}^R = Y' \hat{Y}_{CE}^R \tag{24}$$

and

$$T_{CD} = \hat{Y}_{CD}' \hat{Y}_{CD}^R = Y' \hat{Y}_{CD}^R, \tag{25}$$

each of which has been calibrated to replicate observed states with 100% accuracy. Of course, given the form of either the generalised residual $e_i$ or the simulated residual $\tilde{e}_i$, this is hardly surprising. Indeed, one can guarantee perfect state predictions regardless of whether or not the mean equation is a correct specification of the underlying data generating process. However, conditional on a correct specification of the mean equation, the marginal effects are also consistent. At least at an intuitive level, this makes it more likely that the transitions frequencies themselves will be more reliable than those in (6). Whether they converge to the true transitions frequencies (21) is a question we address in the context of a Monte-Carlo experiment.
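To make the calibrated predictors concrete, the sketch below works through the binomial case under logistic disturbances (the distribution used in the Monte-Carlo design of Section 4). The closed form used for the generalised residual, $E[\varepsilon \mid \varepsilon > c] = c + \ln(1 + e^{-c}) / (1 - F(c))$, is derived here for the logistic case and should be read as an assumption of this sketch rather than a formula taken from the paper. The conditional draw is written so that a draw for $y_i = 2$ always satisfies $x_i'\beta + \tilde{e}_i > 0$. Function names, parameter values and the shock are illustrative.

```python
import numpy as np

def logistic_cdf(a):
    return 1.0 / (1.0 + np.exp(-a))

def generalised_residual(y, a):
    """E[eps | y] for logistic eps, index a = x'beta and y = 1 + 1(a + eps > 0)."""
    F = logistic_cdf(a)
    high = -a - np.log1p(-F) / F          # E[eps | eps > -a], used when y = 2
    low = -a + np.log(F) / (1.0 - F)      # E[eps | eps < -a], used when y = 1
    return np.where(y == 2, high, low)

def conditional_draw(y, a, rng):
    """Inverse-transform draw of eps from its distribution given y."""
    F = logistic_cdf(a)
    v = rng.uniform(size=np.shape(a))
    p = np.where(y == 2, (1.0 - F) + v * F, v * (1.0 - F))
    return np.log(p) - np.log1p(-p)       # logistic quantile function

def transitions(pre, post, J=2):
    """J x J matrix counting moves from state pre_i to state post_i."""
    T = np.zeros((J, J), dtype=int)
    np.add.at(T, (pre - 1, post - 1), 1)
    return T

def theoretical_transitions(a, aR):
    """Sum over i of the 2 x 2 matrices of probabilities (17) to (20)."""
    F, FR = logistic_cdf(a), logistic_cdf(aR)
    up = a < aR                           # I_i = 1 where the shock raises the index
    p11 = (1 - F) * np.where(up, (1 - FR) / (1 - F), 1.0)
    p12 = (1 - F) * np.where(up, (FR - F) / (1 - F), 0.0)
    p21 = F * np.where(up, 0.0, (F - FR) / F)
    p22 = F * np.where(up, 1.0, FR / F)
    return np.array([[p11.sum(), p12.sum()], [p21.sum(), p22.sum()]])

rng = np.random.default_rng(3)
N = 1000
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.uniform(-1, 1, size=N)])
beta = np.array([0.0, 1.0, -1.0])                 # assumed parameter values
y = 1 + (X @ beta + rng.logistic(size=N) > 0)     # observed states
XR = X.copy()
XR[:, 1] += 3.0 * rng.uniform(-1, 1, size=N)      # hypothetical balanced shock

a, aR = X @ beta, XR @ beta
e_ce = generalised_residual(y, a)
e_cd = conditional_draw(y, a, rng)
T_ce = transitions(y, 1 + (aR + e_ce > 0))        # equation (24)
T_cd = transitions(y, 1 + (aR + e_cd > 0))        # equation (25)
T_theory = theoretical_transitions(a, aR)         # equation (21)
```

Comparing `T_ce` and `T_cd` with `T_theory` on a single sample gives only a rough impression; averaging over many replications is what the Monte-Carlo experiment is designed to do.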

4. Computational Results

Table 1: Monte-Carlo simulations. Rows labelled "truth" give true frequencies; all other rows give percentage deviations from truth.

i) negative transition impact: $\Delta = -1$

Balanced design

| | $\hat{n}_{(1)}$ | $\hat{n}_{(2)}$ | $\hat{n}_{(1 \to 1)}$ | $\hat{n}_{(2 \to 1)}$ | $\hat{n}_{(1 \to 2)}$ | $\hat{n}_{(2 \to 2)}$ |
|---|---|---|---|---|---|---|
| truth | 495.05 | 504.95 | 495.05 | 318.68 | 0.00 | 186.27 |
| MP | -1.07 | 1.05 | -1.07 | 19.56 | 0.00 | -30.61 |
| UT | 0.24 | -0.23 | 0.24 | -0.39 | 0.00 | 0.03 |
| CT | 0.24 | -0.23 | 0.24 | -0.45 | 0.00 | 0.14 |
| CE | 0.24 | -0.23 | 0.24 | -1.97 | 0.00 | 2.73 |
| SCD | 0.24 | -0.23 | 0.24 | -0.44 | 0.00 | 0.12 |
| MCD | 0.24 | -0.23 | 0.24 | -0.46 | 0.00 | 0.15 |

Unbalanced design

| | $\hat{n}_{(1)}$ | $\hat{n}_{(2)}$ | $\hat{n}_{(1 \to 1)}$ | $\hat{n}_{(2 \to 1)}$ | $\hat{n}_{(1 \to 2)}$ | $\hat{n}_{(2 \to 2)}$ |
|---|---|---|---|---|---|---|
| truth | 766.78 | 233.22 | 766.78 | 171.76 | 0.00 | 61.46 |
| MP | 13.52 | -44.44 | 13.52 | -32.53 | 0.00 | -77.69 |
| UT | -0.04 | 0.12 | -0.04 | -0.08 | 0.00 | 0.68 |
| CT | -0.04 | 0.12 | -0.04 | -0.18 | 0.00 | 0.96 |
| CE | -0.04 | 0.12 | -0.04 | 0.84 | 0.00 | -1.91 |
| SCD | -0.04 | 0.12 | -0.04 | -0.49 | 0.00 | 1.80 |
| MCD | -0.04 | 0.12 | -0.04 | -0.19 | 0.00 | 0.96 |

ii) balanced transition impact: $\Delta = 0$

Balanced design

| | $\hat{n}_{(1)}$ | $\hat{n}_{(2)}$ | $\hat{n}_{(1 \to 1)}$ | $\hat{n}_{(2 \to 1)}$ | $\hat{n}_{(1 \to 2)}$ | $\hat{n}_{(2 \to 2)}$ |
|---|---|---|---|---|---|---|
| truth | 495.05 | 504.95 | 399.05 | 100.52 | 96.00 | 404.44 |
| MP | -1.08 | 1.06 | -9.68 | 31.19 | 34.64 | -6.43 |
| UT | 0.23 | -0.23 | 0.30 | -0.32 | -0.08 | -0.20 |
| CT | 0.23 | -0.23 | 0.41 | -0.45 | -0.53 | -0.17 |
| CE | 0.23 | -0.23 | 6.51 | -31.10 | -25.89 | 7.45 |
| SCD | 0.23 | -0.23 | 0.43 | -0.30 | -0.61 | -0.21 |
| MCD | 0.23 | -0.23 | 0.39 | -0.48 | -0.44 | -0.16 |

Unbalanced design

| | $\hat{n}_{(1)}$ | $\hat{n}_{(2)}$ | $\hat{n}_{(1 \to 1)}$ | $\hat{n}_{(2 \to 1)}$ | $\hat{n}_{(1 \to 2)}$ | $\hat{n}_{(2 \to 2)}$ |
|---|---|---|---|---|---|---|
| truth | 766.78 | 233.22 | 664.65 | 55.59 | 102.13 | 177.63 |
| MP | 13.52 | -44.44 | 10.87 | -23.13 | 30.72 | -51.08 |
| UT | -0.04 | 0.12 | -0.02 | -0.09 | -0.17 | 0.19 |
| CT | -0.04 | 0.12 | 0.00 | -0.14 | -0.27 | 0.20 |
| CE | -0.04 | 0.12 | 7.07 | -16.72 | -46.27 | 5.39 |
| SCD | -0.04 | 0.12 | 0.03 | -0.09 | -0.45 | 0.19 |
| MCD | -0.04 | 0.12 | 0.00 | 0.02 | -0.25 | 0.15 |

iii) positive transition impact: $\Delta = +1$

Balanced design

| | $\hat{n}_{(1)}$ | $\hat{n}_{(2)}$ | $\hat{n}_{(1 \to 1)}$ | $\hat{n}_{(2 \to 1)}$ | $\hat{n}_{(1 \to 2)}$ | $\hat{n}_{(2 \to 2)}$ |
|---|---|---|---|---|---|---|
| truth | 495.05 | 504.95 | 179.89 | 0.00 | 315.15 | 504.95 |
| MP | -0.98 | 0.97 | -35.04 | 0.00 | 18.45 | 0.97 |
| UT | 0.30 | -0.30 | 0.85 | 0.00 | -0.01 | -0.30 |
| CT | 0.30 | -0.30 | 0.90 | 0.00 | -0.03 | -0.30 |
| CE | 0.30 | -0.30 | 4.02 | 0.00 | -1.82 | -0.30 |
| SCD | 0.30 | -0.30 | 0.62 | 0.00 | 0.12 | -0.30 |
| MCD | 0.30 | -0.30 | 0.88 | 0.00 | -0.03 | -0.30 |

Unbalanced design

| | $\hat{n}_{(1)}$ | $\hat{n}_{(2)}$ | $\hat{n}_{(1 \to 1)}$ | $\hat{n}_{(2 \to 1)}$ | $\hat{n}_{(1 \to 2)}$ | $\hat{n}_{(2 \to 2)}$ |
|---|---|---|---|---|---|---|
| truth | 766.78 | 233.22 | 377.21 | 0.00 | 389.57 | 233.22 |
| MP | 13.52 | -44.44 | -5.03 | 0.00 | 31.46 | -44.44 |
| UT | -0.04 | 0.12 | 0.24 | 0.00 | -0.31 | 0.12 |
| CT | -0.04 | 0.12 | 0.25 | 0.00 | -0.32 | 0.12 |
| CE | -0.04 | 0.12 | 5.03 | 0.00 | -4.95 | 0.12 |
| SCD | -0.04 | 0.12 | 0.23 | 0.00 | -0.30 | 0.12 |
| MCD | -0.04 | 0.12 | 0.24 | 0.00 | -0.31 | 0.12 |

To compare the various transitions estimators covered in this paper, we assess performance within the framework of a simulated sample design. By doing so we enjoy an element of control sufficient to highlight the conditions under which predicted transitions deviate from what we know is the true data generating process.[7]

[7] We nevertheless recognise that Monte Carlo results based on such an approach are clearly design-specific, and may not relate directly to the sorts of economic problems most regularly confronted by applied researchers.

We simulate a binomial version of the general statistical design (1) which gives a (differenced) latent variable of the form $u_i^* = u_{i2}^* - u_{i1}^*$. This latent variable is assumed to depend linearly on a set of three characteristics $x_i = [1, x_{2i}, x_{3i}]'$ through the parameter vector $\beta = [\beta_1, 1, -1]$ to give a latent relationship of the form

$$u_i^* = x_i'\beta + \varepsilon_i = \beta_1 + x_{2i} - x_{3i} + \varepsilon_i, \tag{26}$$

for $i = 1, \ldots, N$ where $x_{2i} \sim N(0, 1)$, $x_{3i} \sim U(-1, 1)$ and $\varepsilon_i$ is logistically distributed with mean 0 and variance $\pi^2/3$. All are independently drawn. The binomial indicator variable $y_i$ relates to (26) through the mapping $y_i = 1 + 1(x_i'\beta + \varepsilon_i > 0)$. By adjusting the value of the constant term $\beta_1$ in (26), we are able to control the probability of observing $y_i = 1$ or 2. We consider two cases: a balanced design for which $\Pr(y_i = 1) = 0.5$ and an unbalanced design for which $\Pr(y_i = 1) = 0.75$.[8] We examine transitions following an impact on $x_{2i}$. For the simulated design, we predict transitions once $x_{2i}$ is superseded by $x_{2i}^R$, where

$$x_{2i}^R = x_{2i} + z \cdot (U[-1, 1] + \Delta). \tag{27}$$
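The design in (26) and (27) can be sketched as follows. Two details are assumptions of the example rather than statements from the paper: the uniform term in (27) is drawn independently for each observation, and the constant $\beta_1$ for a target value of $\Pr(y_i = 1)$ is found by simulation as minus the corresponding quantile of $x_{2i} - x_{3i} + \varepsilon_i$. The function names are hypothetical.

```python
import numpy as np

def constant_for_target(target_p1, rng, n_draws=1_000_000):
    """beta_1 such that Pr(y = 1) is approximately target_p1 (found by simulation)."""
    s = (rng.normal(size=n_draws)
         - rng.uniform(-1, 1, size=n_draws)
         + rng.logistic(size=n_draws))
    # Pr(beta_1 + s <= 0) = target  <=>  -beta_1 is the target quantile of s
    return -np.quantile(s, target_p1)

def simulate_design(beta1, z, delta, N, rng):
    """One sample from (26) together with the shocked regressor from (27)."""
    x2 = rng.normal(size=N)
    x3 = rng.uniform(-1, 1, size=N)
    eps = rng.logistic(size=N)                    # mean 0, variance pi^2 / 3
    beta = np.array([beta1, 1.0, -1.0])
    X = np.column_stack([np.ones(N), x2, x3])
    y = 1 + (X @ beta + eps > 0)                  # pre-shock state
    XR = X.copy()
    XR[:, 1] = x2 + z * (rng.uniform(-1, 1, size=N) + delta)
    yR = 1 + (XR @ beta + eps > 0)                # post-shock state, same eps
    return X, XR, y, yR

rng = np.random.default_rng(4)
beta1_unbalanced = constant_for_target(0.75, rng)
X, XR, y, yR = simulate_design(beta1_unbalanced, z=3.0, delta=1.0, N=1000, rng=rng)
share_state1 = np.mean(y == 1)
```

For the balanced design the symmetry of all three random components about zero implies $\beta_1 = 0$. Holding `eps` fixed across the base and shocked regimes is what makes `y` and `yR` a true transition for each observation.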

The parameters $z$ and $\Delta$ in (27) control, respectively, the scale and the direction of the impact on $x_{2i}$. The larger is $z$, the larger is the scale of the impact. If $\Delta = +1$ $(-1)$ the impact is entirely positive (negative), whereas for $\Delta = 0$ the impact is balanced.

[8] For a similar Monte-Carlo design, see Windmeijer (1995).

Monte-Carlo results for this experimental design are reported in Table 1. The sample size is set at $N = 1000$. We generate 5000 replications of $\varepsilon_i$ (for fixed $x_i$) and cumulate transitions matrices following impacts of the form $z = 3$, $\Delta \in \{-1, 0, 1\}$. We report in each table the true state frequencies $\hat{n}_{(j)}$ for states $j = \{1, 2\}$ and the true transitions frequencies $\hat{n}_{(j \to k)}$ from state $j$ to state $k$, for $j, k = \{1, 2\}$, applying $T$ to the true parameter vector $\beta$ (denoted "truth"), and the percentage deviations from these benchmarks for the following six transitions estimators: MP (averaging $T_{MP}$ over repeated estimates $\hat{\beta}$); UT (averaging $T$ over repeated estimates $\hat{\beta}$); CT (averaging $T$ over repeated estimates $\hat{\beta}$, conditioned on $y_i$); CE (averaging $T_{CE}$ over repeated estimates $\hat{\beta}$); SCD (averaging $T_{CD}$ over repeated estimates $\hat{\beta}$); MCD (averaging $T_{CD}$ over repeated estimates $\hat{\beta}$ and repeated draws of $\tilde{e}_i$).

For the balanced design, the results in Table 1 confirm a number of suspicions. Compared with the benchmarks, note first how unreliable is the maximum probability estimator $T_{MP}$ for the majority of experimental designs. Even though the predicted state frequencies are broadly correct, we find a systematic over-prediction of off-diagonal transitions of up to 35% depending on the direction of the transition impact. For the majority of the alternative transitions estimators, Monte-Carlo evidence indicates broad convergence to the true values regardless of $\Delta$. The one exception to this general pattern is the transitions estimator $T_{CE}$ calibrated on the generalised residual, which under-predicts off-diagonal transitions particularly for $\Delta = 0$. The intuition for this result is that a calibration using $e_i$ tends to place the pre-shock predicted probability well away from the critical level 0.5, and makes the exogenous shock less likely to force the post-shock probability across that boundary.

We get similar results for the unbalanced design, with more severe biases relative to the true frequencies for estimators $T_{MP}$ and $T_{CE}$. Notice that, when the design becomes unbalanced, the predicted state frequencies from $T_{MP}$ deviate markedly from the truth. As noted by Windmeijer (1995), the use of the naive maximum probability rule tends to under-predict (over-predict) the sparse (dense) state due to the wasteful nature of the metric which translates the predicted probability into a discrete state prediction. The under-prediction of off-diagonal transitions by $T_{CE}$ is also manifestly clear, with errors of up to 46% for some designs. Again, all other measures perform well.

Repeating these simulations over different sample sizes and for transitions impacts of different magnitudes, the same overall patterns are observed.

5. Conclusions

We focus in this paper on various methods by which transitions following an exogenous shock to the deterministic component of models of discrete choice may be simulated. Concentrating on low-dimensional problems, we have been able to confirm the bias of naive discrete state predictors based on the standard maximum probability rule, and show that the problems associated with such predictors extend to the reliability of standard methods by which transitions are simulated. In a two-dimensional framework, explicit forms for transitions frequencies have been proposed, the reliability of which has been confirmed using both simulated and sample-survey based Monte-Carlo designs, with particular emphasis on the prediction of labour market transitions. We exploit various methods by which discrete choice models may be calibrated to offer alternative transitions estimators which perform well in Monte-Carlo simulations. An extension is suggested which offers a robust and unbiased alternative transitions estimator in higher-dimensional problems.

REFERENCES

Amemiya, T. (1981): "Qualitative Response Models: A Survey," Journal of Economic Literature, 19, 483-536.
Bingley, P., and I. Walker (1995): "Labour Supply, Unemployment and Participation in In-work Transfer Programmes," Discussion paper, Institute for Fiscal Studies.
Cramer, J. S. (1996): "Predictive Performance in the Binary Logit Model," mimeo, Tinbergen Institute, Amsterdam.
Duncan, A., and C. Giles (1996): "Labour Supply Incentives and Recent Family Credit Reforms," Economic Journal, 106, 142-155.
Duncan, A., and M. Weeks (1997): "Behavioural Tax Microsimulation with Finite Hours Choices," European Economic Review, 41, 619-626.
Gourieroux, C., A. Monfort, E. Renault, and A. Trognon (1987): "Generalised Residuals," Journal of Econometrics, 34, 5-32.
Gourieroux, C., and A. Monfort (1993): "Simulation-Based Inference: A Survey with Special Reference to Panel Data Models," Journal of Econometrics, 59, 5-33.
Windmeijer, F. (1995): "Goodness-of-Fit Measures in Binary Choice Models," Econometric Reviews, 14, 101-116.