Estimating Party Policy Positions with Uncertainty Based on Manifesto Codings∗

Kenneth Benoit
Trinity College Dublin
[email protected]

Michael Laver
New York University
[email protected]

Slava Mikhaylov
Trinity College Dublin
[email protected]

August 21, 2007

Abstract

Spatial models of party competition are central to modern political science. Before we can elaborate such models empirically, we need reliable and valid measurements of agents’ positions on salient policy dimensions. The primary empirical time series of estimated party positions in many countries derives from the content analysis of party manifestos by the Comparative Manifesto Project (CMP). Despite widespread use of the CMP data, and despite the fact that estimates in these data arise from documents coded once, and once only, by a single human researcher, the level of error in the CMP estimates has never been estimated or even fully characterized. This greatly undermines the value of the CMP dataset as a scientific resource. It is in many ways remarkable that so much has been published in the best professional journals using data that almost certainly have substantial, but completely uncharacterized, error. We remedy this situation. We outline the process of generating CMP document codings and positional estimates. Error in this process arises not only from the obvious source of coder unreliability, but also from fundamental variability in the stochastic process by which latent party positions are translated into observable manifesto texts. Using the quasi-sentence codings from the CMP project, we reproduce the error-generating process by simulating coder unreliability and bootstrapping analyses of coded quasi-sentences to reproduce both forms of error. Using our estimates of these errors, we suggest and demonstrate ways to correct otherwise biased inferences derived from statistical analyses of the CMP data.

Key Words: Comparative Manifesto Project, Mapping party positions, party policy, error estimates, measurement error.

∗ Prepared for the 2007 Annual Meeting of the American Political Science Association, Hyatt Regency and Sheraton Chicago, Chicago, Illinois, August 30–September 2, 2007. This research was partly supported by the European Commission Fifth Framework (project number SERD-2002-00061), and by the Irish Research Council for the Humanities and Social Sciences. We thank Andrea Volkens for generously sharing her experience and data regarding the CMP, and Thomas Däubler for research assistance. We also thank James Adams, Garrett Glasgow, Simon Hix, Abdul Noury, and Sona Golder for providing and assisting with their replication datasets and code. An outline form of this paper was drawn up for the Conference on the Dynamics of Party Position Taking, March 23–24, 2007, University of Binghamton.

1 Introduction

Any scholar interested in understanding party competition is interested, sooner or later, in measuring the policy positions of political parties. There are many different ways to do this, including but not limited to the analysis of legislative roll calls; survey data on the preferences and perceptions of political elites; survey data on the preferences and perceptions of voters; surveys of experts familiar with the political system under investigation; and the analysis of political texts generated by the political agents of interest. Benoit and Laver (2006) review and evaluate these different approaches. Here we focus on one source of such measurements, the long time series of estimated party policy positions generated by the Comparative Manifesto Project (CMP) and first reported in 1987. Over the years since then, the CMP has steadily built up a huge and important dataset on party policy in a large number of countries over the entire post-war period, based on the content analysis of party manifestos. This was reported in the project’s core publication, Mapping Policy Preferences (Budge, Klingemann, Volkens, Bara & Tanenbaum 2001, hereafter MPP), to have covered thousands of policy programs, issued by 288 parties, in 25 countries over the course of 364 elections during the period 1945–1998. The dataset has recently been extended, as reported in the project’s most recent publication, Mapping Policy Preferences II (Klingemann, Volkens, Bara, Budge & McDonald 2006, hereafter MPP2), to incorporate 1,314 cases generated by 651 parties in 51 countries in the OECD and central and eastern Europe (CEE) over the period 1990–2003. These data are, commendably, freely available from the CMP and have been very widely used by the profession, as can be seen from over 800 Google Scholar citations by third-party researchers of the core CMP publications.1 The data have been extensively used by third-party researchers in four major areas of political science: in descriptive analyses of party systems (e.g. Bartolini & Mair 1990, Evans & Norris 1999, Mair 1987, Strom & Leipart 1989, Webb 2000); in analyses of party competition (e.g. Adams 2001, Janda, Harmel, Edens & Goff 1995, Meguid 2005, Van der Brug, Fennema & Tillie 2005); in models of coalition building and government formation (e.g. Baron 1991, Schofield 1993); and in analyses of the responsiveness of representative democracy, in linkages between government programs and governmental policy implementation (e.g. Petry 1988, Petry 1991, Petry 1995). Furthermore, the CMP data have been used in such diverse areas as, for example, analysis of political behavior in the US (e.g. Erikson, Mackuen & Stimson 2002); evaluation of partisan effects on government expenditure (e.g. Bräuninger 2005); identification of the structure and dimensionality of the political space of the European Parliament, European Commission and Council of Ministers (e.g. Thomson, Boerefijn & Stokman 2004); evaluation of issue convergence in US presidential campaigns (e.g. Sigelman & Buell 2004); analysis of the relationship between budgetary cycles and political polarization and transparency (e.g. Alt & Lassen 2006); evaluation of partisan effects on trade policy (e.g. Milner & Judkins 2004); establishing the effect of endogenous deregulation on productivity in the OECD (e.g. Duso & Röller 2003); and as a means to validate other measures of parties’ policy positions, such as expert surveys (e.g. Ray 1999).

Notwithstanding the fact that the CMP project is ostensibly grounded in a “saliency” theory of party competition that assumes “all party programmes endorse the same position, with only minor exceptions” (MPP, 82), third-party scholars have overwhelmingly used these data to estimate different party positions. Indeed, the CMP itself has used changes in party positions over time, especially on its left-right positional scale, to validate its own estimates. To a very large extent, the CMP’s estimated time series of parties’ left-right positions has been the dataset’s overwhelming attraction for third-party researchers. For scholars seeking long time series of party policy positions in many different countries, the CMP dataset is effectively the only show in town. Many significant publications have depended on these estimates.

Despite the wide range of researchers who have depended on CMP estimates of party policy positions for their key empirical results, however, these data are provided without a basic feature considered essential to any estimate: a measure of the uncertainty surrounding the estimated quantity. For reliable and valid use of the CMP data, such measures of uncertainty are fundamental. Without them, users of the data cannot distinguish between “signal” and “noise”, making it impossible to tell the difference between measurement error and “real” movements in party policy positions from one election to another. If we cannot tell whether two CMP estimates differ because of a change in the underlying signal, or because of error in the data—whether from measurement or from fundamental variability—then this drastically undermines the CMP data in terms of their primary value for third-party research: as a rich time series of party policy positions.

In this paper, we set out to address this problem and thereby make the CMP data far more useful for scholars who want to use them to analyze time series of party positions. We do this by decomposing possible stochastic elements in the data generation process that leads to the CMP estimates and simulating the effects of these. These stochastic processes concern, inter alia, the manifesto generation process. Two authors, or one author at two different times, trying sincerely to express an identical party position are very likely to generate somewhat different manifestos. Thus the observed manifesto is one of many different manifestos that could have been written to express the same position. Given an observed manifesto, a second stochastic process involves manifesto coding. Two different coders, or the same coder working at two different times, are very likely to code the same manifesto in somewhat different ways.2 The result of our analyses of these stochastic processes is a set of simulated standard errors for the CMP data that allow scholars to make more informed judgments, when they compare two CMP estimates of different parties at the same time or of the same party at different times, about whether these differences are likely to be “real”, or the result of stochastic features of the data generation process.

In what follows, we first provide a brief description of the CMP dataset and the processes that led to its generation. We then analyze the two main stochastic processes that are involved: manifesto writing and manifesto coding. Based on these analyses, we show how to calculate standard errors for each estimate in the CMP dataset. Analyzing these error estimates, we find that many CMP quantities should be associated with substantial uncertainty. Exploring the effects of this uncertainty on substantive work, we show how the error estimates can be brought to bear in judging substantive change versus measurement error in both time-series and cross-sectional comparisons of party positions. Finally, we suggest ways to use our error estimates to correct analyses that use CMP data as covariates, and use these to re-run some prominent analyses reported in recent literature.

1 The precise number of third-party citations is hard to calculate because third-party users are likely to cite several CMP sources in the same paper.

2 The CMP coding scheme

The coding of party manifestos by CMP researchers has been very extensively described in the CMP’s core publications (MPP and MPP2). We do not repeat these descriptions here, focusing only on core features of this process that inform our analyses below. As is well known to all who use these data, the CMP’s coding of party manifestos was based on a 56-category coding scheme developed in the mid-1980s, building on earlier work by David Robertson (1976).3

The process by which the CMP dataset was constructed consists of several steps. First, researchers had to identify a “manifesto” for a particular party at a particular election. Often this document was obvious, although sometimes it was not. Not all parties issued what we now think of as a manifesto at every election, in which case an alternative document had to be identified and coded. We do not consider this issue further, and assume that all documents coded are valid statements of parties’ electoral positions—i.e., manifestos or valid manifesto surrogates.

Second, the manifesto was read by a trained human coder and broken down into sentences, or what the CMP calls “quasi-sentences”. Focusing on quasi-sentences provides a means to break down a complex sentence into self-contained units, each expressing a distinct message. These units might, had the manifesto been written by a different author with different stylistic proclivities, have been written as standard, syntactically separate sentences. The process of identifying quasi-sentences, unlike the identification of sentences—which only requires the coder to be able to recognize a “full stop” (or “period” west of the Atlantic) when s/he sees one—thus: (a) fundamentally recognizes the inherently stochastic nature of the text-generation process; and (b) reflects a subjective judgment that might in practice vary between coders. In practice, as we shall see below, this means that the total number of quasi-sentences identified for any given manifesto does in fact vary substantially among different coders. Despite this potential variance in even the basic text units to be coded, the estimates reported in the CMP dataset are based on manifestos coded once, and only once, by a single coder.

Having broken the text under analysis into quasi-sentences, the coder goes through the text, quasi-sentence by quasi-sentence, and assigns each of these to one (and only one) policy category in this scheme, or to an “uncoded” category. (The proportion of uncoded quasi-sentences in each manifesto in the CMP dataset is typically in the range 0–5 percent, as we will see below.) Once a manifesto has been coded in this way, the number of quasi-sentences coded into each category is counted, and reported as a percentage of the total number of quasi-sentences in the manifesto. These percentages, by coding category, are the fundamental quantities in the CMP data that are used by third-party researchers.4 While the summaries reported in the CMP dataset may look on the surface like quantitative measures of party policy positions, therefore, they are in fact fundamentally qualitative.5

Although the total number of quasi-sentences identified in each manifesto is reported in the CMP dataset (and summarized in Table 1), this quantity is almost never referred to by subsequent users of the data. Yet about 30 percent of all manifestos had fewer than 100 quasi-sentences, coded into one of 56 categories. Some manifestos had fewer than 20 quasi-sentences; some had more than 2,000. Despite this wide variation in the amount of policy information in different manifestos, all estimated party policy positions are almost invariably treated in the same way, regardless of whether they are derived from a coding of 20 text units or 2,000. We return in some detail to this matter below, when discussing ways to use text length to estimate uncertainty for the CMP quantities.

In practice, most scholars do not use the full 56-category set of CMP coding categories, since they typically seek a low-dimensional representation of party policy positions. It is very common practice, therefore, for scholars to use the composite left-right scale developed and tested by the CMP. Based on subjective judgments about the meaning of particular categories, supplemented by exploratory factor analyses, this simple additive scale combines information from 26 CMP policy variables, 13 referring to left-wing positions and 13 referring to right-wing positions. The CMP itself refers to the measurement of party movement on this left-right positional scale as its “crowning achievement” (MPP, 20) and, as noted, has relied extensively on such movements for establishing the face validity of its estimates. We do not examine the substantive validity of this scale here. We do, however, devote considerable attention to an analysis of potential errors in the estimation of this left-right scale, since it is the CMP measure used by most third-party researchers.

Having described the generation of the CMP dataset, we turn to the main substance of this paper. The methodological issues we consider are as follows. (a) There is a stochastic process involved in the generation of particular manifestos by particular political parties, even if there is absolutely no intra-party disagreement about policy. (b) The number of quasi-sentences in any given manifesto is not directly observed, but is a matter of subjective judgment by a single coder. (c) There is a stochastic process involved in the coding of particular manifestos by particular coders, when assigning observed manifesto quasi-sentences to particular CMP coding categories. In what follows, we investigate the sensitivity of CMP estimates to each of these aspects of the data generation process. In order to formalize our description of the process which produces the observed CMP dataset, we characterize this process statistically.

2 We do not consider here the issue of coder bias—the potential for which arises because coders know, before they start coding, which party generated the manifesto under investigation.
3 We do not here consider the substantive validity of this scheme, noting only that the particular 56-category CMP coding scheme that generated these data is one of many possible coding schemes that might have been devised.
4 We do not deal here with the very significant statistical issues arising from the fact that the CMP data as typically used are “compositional”, in the sense that CMP estimates for all policy categories are constrained to sum to a constant.
5 We stress that we do not regard this in any way as a criticism of the CMP dataset, since it is a fundamental feature of all human-coded content analysis. It is a crucial point to appreciate, however, when assessing the stochastic processes involved in data generation.

3 The Stochastic Process Generating CMP Data

3.1 Statistical characterization of the CMP coding process

Consider the 56-category coding scheme as a set of mutually exclusive, exhaustive, and fixed categories into which each sentence must be coded. Treating “uncoded” text as a single category, this gives a total of k = 57 categories. For a single manifesto, we describe observed quasi-sentence counts for each coding category j by a set of random variables X_j that indicate the number of times category j was coded from this manifesto. For the CMP coding scheme, when j = 1 for example, x_1 corresponds to the number of quasi-sentences coded to the first CMP category, labeled “101: Foreign special relationships: positive”. In our treatment, which includes “uncoded” as an additional category, x_57 corresponds to the number of quasi-sentence units left “uncoded”. For a given manifesto, we denote the total number of quasi-sentences as n. For the probabilities that a quasi-sentence will be coded into the jth category, we use p_1, ..., p_k, where p_j ≥ 0 for j = 1, ..., k and ∑_{j=1}^{k} p_j = 1. If we consider that each quasi-sentence’s category is determined independently from that of other quasi-sentences, then we can characterize the CMP’s manifesto coding scheme as corresponding to the well-known multinomial distribution with parameters n and p. The probability of observing given counts of quasi-sentence categories is then given by the multinomial formula:

\[
\Pr(X_1 = x_1, \ldots, X_k = x_k) =
\begin{cases}
\dfrac{n!}{x_1! \cdots x_k!}\; p_1^{x_1} \cdots p_k^{x_k} & \text{when } \sum_{j=1}^{k} x_j = n \\[4pt]
0 & \text{otherwise}
\end{cases}
\tag{1}
\]

Translating into the CMP coding scheme, for a given manifesto each p_j represents the “true” proportion of codes that occur in a given category j. If we consider the “PER” or percentage categories reported by the CMP for each manifesto, then what is being reported is p̂_j × 100, the estimate of each manifesto’s “true” percentage of quasi-sentences from category j.

So far we have considered only the case of a “given” manifesto, but of course the CMP dataset deals with many such units—a total of 3,018 separate units representing different combinations of country, election date, and political party for the combined (MPP + MPP2) datasets.6 For our purposes, we are interested in estimating the probabilities represented by the k-element parameter vector p for each manifesto. If we index the manifestos by i, then our quantities of interest are p_{i1}, ..., p_{ik} for each manifesto i. In this paper we are not concerned with the systematic factors that influence each p_{ij}, but rather with the variance of these quantities. Nor are we concerned with a substantive critique of the representation of the world of party policy embodied in the 56-category CMP scheme, which we take as given for our purposes here. Despite the clearly observable and striking variations in manifesto length and content, in other words, we know of no systematic expectation, based on empirical evidence or even pure theory, about the nature of the stochastic process that maps actors’ (unobservable) political positions to observed manifesto content. The process whereby the unobserved parameters π_{ij} are generated, in other words, is something of a mystery that we simply assume to be revealed by the coded manifesto dataset itself. Our substantive model of the observed proportions p, therefore, is exceedingly simple:

\[
E(p_{ij}) = \pi_{ij} \tag{2}
\]

where each π_{ij} represents the true and unobservable policy position of country-party-date unit i on issue j. For our purposes (and those of the CMP, or almost any other enterprise setting out to measure unobserved policy positions of political actors) we consider π_{ij} as something fixed but real. The first way that error can be introduced into the process, then, is through the imperfect and stochastic mapping of the underlying position π_{ij} into observed proportions p_{ij} that generate stochastically determined category counts X_{ij}. In statistical terms, Var(p_{ij}) > 0, even though π_{ij} is fixed at a single, unvarying point.

6 It is not quite accurate to state that the dataset represents 3,018 separate manifestos, since some of these country-election-party units share the same manifesto with other parties (progtype=2) or have been “estimated” from adjacent parties (progtype=3). See Appendix 3, MPP. The full CMP dataset also failed to provide figures on either total quasi-sentences or the percentage of uncoded sentences for 141 manifesto units, limiting the sample analyzed here to 2,877 units.
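To make this characterization concrete, the following sketch simulates the coding of a single hypothetical manifesto under the multinomial model of Equation 1. The category probabilities and number of replications are invented for illustration and are not CMP values; only the length (171 quasi-sentences, the CMP median reported below) comes from the data described in this paper.

```r
## A minimal sketch of the multinomial coding model (Equation 1).
## The "true" category proportions pi are invented; in the CMP data
## they are unobserved and must be estimated from a single coding.
set.seed(42)
k  <- 57                       # 56 CMP categories plus "uncoded"
pi <- rgamma(k, shape = 0.5)   # arbitrary positive weights
pi <- pi / sum(pi)             # normalize so proportions sum to one
n  <- 171                      # the CMP's median manifesto length

## One simulated coding, and its "PER"-style percentages:
counts <- rmultinom(1, size = n, prob = pi)[, 1]
per    <- 100 * counts / n

## Repeated draws show the sampling variability around pi * 100:
draws <- rmultinom(1000, size = n, prob = pi)
head(apply(100 * draws / n, 1, sd))   # empirical SDs, first six categories
```

Even with the “true” position held fixed, the reported percentages vary from draw to draw—precisely the fundamental variability described above.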

3.2 Stochastic manifesto generation

Figure 1 summarizes our characterization of the text generation process. This begins on the left-hand side with an underlying policy position that party decision-makers wish to communicate—paying no attention in this context to whether the position to be communicated is “sincere” or strategic, or to whether there is internal party conflict over the position. Assuming there is a unique position to be communicated—which is fundamentally unobservable by anyone else but common knowledge to party authors—a manifesto text is generated to do this job. Our point is that many different texts could be generated to communicate the same position. There is nothing “golden” about the actual published text, although the generation of policy texts is a mysterious and idiosyncratic process, about which our best information is largely anecdotal. Thus, for example, we do know that drafts of eventually-published policy texts are typically amended many times, often right up to the last minute before publication. This means there are often many different actual texts, only one of which is eventually published. To this we can add the very many conceivable policy texts that could have been drafted to communicate precisely the same position. Consider, for example, the universe of possible 10,000-word texts that could be generated using a 5,000-word political lexicon. There is a gigantic number of possible different texts in this universe, and it would clearly be ridiculous to claim that every single one of these communicates a different policy position.

[FIGURE 1 ABOUT HERE]

This problem is compounded by the fact that policy texts do not have a constant length n_i, even though text length might be subject to general norms and conventions. One thing clear from any investigation of the CMP dataset, for example, is that the lengths of the (coded) manifestos vary significantly, as shown in Table 1. Some are quite short, with the shortest manifestos consisting, astonishingly, of fewer than 10 quasi-sentences. At the other extreme, some very long manifestos contained more than 2,000 quasi-sentences, and four contained more than 5,000. Most manifestos were much terser, however, with a median length of 171 quasi-sentences. The middle half of all manifestos were between 83 and 432 quasi-sentences in length; this implies that a quarter of all manifestos were fewer than 83 quasi-sentences long.

[TABLE 1 ABOUT HERE]

The total number of quasi-sentences found in any one manifesto appears to be—at least for lack of any systematic information or prior expectations on this topic—unrelated to any of the political variables of interest. Yet assuming that the proportions π_{ij} remain the same irrespective of document length, increasing the length of a manifesto should also increase our confidence in the estimates of these proportions. This reflects one of the most fundamental concepts in statistical measurement: the notion that uncertainty about an estimate should decrease as we add information to that estimate.

[TABLE 2 ABOUT HERE]

The problem of short manifestos, and thus of sparse data, is compounded by the fact that not all quasi-sentences can be coded by the CMP scheme. For instance, while immigration is self-evidently an important subject of political debate these days in many different countries, there is no unambiguous coding category in the CMP scheme in which statements about immigration, asylum-seekers, or labor market restrictions on foreigners can be recorded. Table 2 describes the frequencies of manifestos with varying levels of uncodable content, as a percentage of the total number of quasi-sentences. While the median percentage of uncoded content is quite low, at 2.1%, the top quarter of all manifestos contained 8% or more uncoded content, and 10 percent contained 21% or more. The saliency theory of party competition that motivated the original CMP project led to the methodological assumption that party policy in particular areas can be measured in terms of the percentage of total manifesto content coded into each category—in effect assuming that ∑_j π_{ij} = 1. This means that increasing the proportion of uncodable content will necessarily decrease the estimated proportions of coded content.


This problem has been noted previously (e.g. Kim & Fording 1998) in the context of the left-right scale in particular, but it afflicts all categories, particularly the “pure salience” categories such as PER501, a positive mention of the environment. Since (unpredictably variable) uncoded content affects the policy estimates that are calculated from coded content, it introduces a further stochastic element into the manifesto generation process.7

The right-hand side of Figure 1 shows what happens between the generation of an observed party manifesto and the eventual characterization of this document in the CMP dataset—the data that are so widely used in the profession. Here, we set aside the problem of locating the correct document to code, in a setting where there is no single “unambiguous” party manifesto, pausing only to note that this problem is likely to become more serious over time rather than less so. This is because, as we move increasingly towards multi-media communication, the printed word seems less likely to be seen as the “one true text”. Elsewhere in our own work on the analysis of party manifestos, for example, we have found sharp discrepancies between the text of an official statement of party policy as posted on the party web site, and the text of the printed party manifesto. In such cases, which will become more common, a serious issue arises as to which “official” text to analyze.

Assuming we have identified the one true text, however, quite understandable real-world resource constraints meant that most of the manifestos that form the basis of the CMP dataset were coded once, and once only, by a single coder. Ideally, of course, each text would be coded many times by many coders, in which case we would be very confident indeed in predicting that not all coders would code every quasi-sentence in every manifesto in precisely the same way. There is thus a clear potential for stochastic variation in how coders code texts. Some of this potential variation arises from mundane, but in a large-scale project almost inevitable, mistakes in either coding or data recording. More tellingly, some of the potential variation arises because allocating quasi-sentences to qualitative coding categories is a fundamentally subjective process, which clearly implies that different coders might well make different judgment calls in the same circumstance. The response to this latter problem by the CMP was to devote increasing levels of care and attention to training its hired coders, with the objective of reducing the variance in human coders’ application of the same coding scheme to the same “master” text, selected by the CMP and reproduced in MPP2.8

7 When uncoded content consists of purely non-political sentences, this still affects the political sentence codings because of the additive assumption that position is determined by the proportion of mentions. In the context of discrete choice models such as the multinomial logit, this problem is known as a violation of the IIA principle. When uncoded content consists of political content that the CMP scheme cannot capture, the problem is likely to be a more serious one of bias, although we do not investigate that possibility here.
8 There is also the problem that the reliability of coder training—and hence the level of measurement error attributable to coding differences—has changed over time. Although we do not model this in any way, the manifestos coded earliest in the project were more prone to error than manifestos coded later, as experience increased and as methods for selecting and training coders became better established. See Volkens (2001a) for details.

[TABLE 3 ABOUT HERE]

This training process does, however, generate some limited empirical evidence about levels of variation in the qualitative judgments made about the same text by different coders. This is reported in various publications by Andrea Volkens, who supervised this process over a long period of time (Volkens 2001a, Volkens 2001b, Volkens 2007). Table 3 shows the published information on this matter, which is reported in terms of a particular set of correlations. Taking coding categories as “cases” and coders as “variables”, there is an observed percentage of the common text that each coder codes into each category. The CMP reports the correlation between two coders, in this sense, and in particular correlations between the judgments made by a production coder and those seen as the “master” codings of the training text. Table 3 shows that these correlations range from .71 to .88, with the first figure relating to untrained coders and correlations in the region of 0.85–0.88 reported for trained coders on their second contract with the CMP. Thus, taking the highest inter-coder correlation reported for the most experienced CMP coders and ignoring the possibility of coder bias, about 77 percent of the variance in the coding of one analyst is retrieved by the other. This estimate is conservative, since it uses marginal totals for coding categories and thus ignores the possibility of self-canceling coding errors, although Volkens (2001a, 38; and personal communication) believes these to be quite rare.

Even given this rather limited empirical evidence, we are left somewhat in the dark, since we simply do not know, without careful investigation, the effect of inter-coder correlations of .71 or even .90 on the level of noise in the final dataset. The ideal way to drive towards more reliable estimates, assuming no coder is perfectly reliable, is to have multiple coders code each text and then combine this information to produce some form of estimate with associated uncertainty. The resource implications of this suggestion are enormous, however, and it is not even clear that all of the 3,000+ source documents already coded are physically available for feasible reanalysis. In what follows, we set out an alternative approach, which is to leverage the empirical information we actually do have about variations in text lengths and coding output, using this to parameterize simulations of the error process we have described above. As we shall see, the information we already have can be analyzed to give us a fairly good idea of how reliable any given coding of any given manifesto is likely to be.
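The reliability statistic just described is simply a correlation computed across the coding categories, treating coders as variables. A minimal sketch with two invented coding vectors (a hypothetical “master” coding and a perturbed production coding; neither is real CMP training data):

```r
## Inter-coder reliability as the CMP reports it: correlate, across the
## coding categories, the percentages two coders assign to the same text.
## Both coding vectors below are invented for illustration.
set.seed(7)
master <- rmultinom(1, size = 150, prob = rep(1 / 57, 57))[, 1]
coder  <- pmax(0, master + sample(-1:1, 57, replace = TRUE))  # perturbed

per_master <- 100 * master / sum(master)
per_coder  <- 100 * coder / sum(coder)
cor(per_master, per_coder)  # Volkens reports .71-.88 for real coders
0.88^2                      # ~0.77: variance retrieved at the best correlation
```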


4 Estimating Error in Manifesto Generation

4.1 Analytical error estimation

One way to assess the error variance of any of the 56 PER categories is through an analytic calculation derived from the multinomial formula (Equation 1). The target is to determine, analytically, the variance of the empirical proportions reported by the CMP, which in the language described above represent π̂_{ij} × 100 for each category j and each manifesto i. Here we are assuming no bias (i.e. Equation 2), where each π_{ij} represents the true and unobservable policy position of country-party-date unit i on issue j. What this means, practically, is that we can use the CMP-provided proportions as the best point estimates of the unobservable proportions π_{ij}. What we would like to estimate are the standard errors (or variances) of the reported proportions p_{ij} × 100. (The CMP reports 56 of these proportions as the “PER” categories, and they are the ones we take here to be our expectations of the “truth” measured without bias, as per Equation 2.) The question then is: what is the variance of this percentage quantity, Var(p_{ij})? For any given X_{ij} from the multinomial distribution,

\[
\mathrm{Var}(X_{ij}) = n\, p_j (1 - p_j) \tag{3}
\]

With a bit of algebraic manipulation (detailed in Appendix A) we can express the variance of the proportion p_j, and of the rescaled percentage used by the CMP, as:

\[
\mathrm{Var}(p_j) = \frac{1}{n}\, p_j (1 - p_j) \tag{4}
\]
\[
\mathrm{SD}(p_j \times 100) = \frac{100}{\sqrt{n}} \sqrt{p_j (1 - p_j)} \tag{5}
\]
so that
\[
\mathrm{SD}(p_{ij}) \propto 100\, n_i^{-1/2}.
\]

In part, then, the error will depend on the size of the true percentage of mentions p_{ij} × 100 for each “PER” category j. Assuming this quantity is fixed for each party-election unit i, however, what is variable as a result of the data generating process is the length n_i of the manifesto that the party associated with unit i deposits. This aspect of the error in the CMP estimates, therefore, is inversely proportional to the square root of the length of the manifesto. This should be reassuring, since it means that longer manifestos will reduce the error in any CMP category, irrespective of its proportion p_j. Longer manifestos provide more information, in other words, and therefore we can be more confident about estimates constructed from them. The same will also be true for additive scales constructed from CMP coding categories, such as the left-right scale referred to as RILE in the CMP data, although we must replace the proportion p_{ij} with p_{iRILE} = ∑_{r=1}^{R} p_{ir}, where r indexes the R-element set of the 26 categories which make up the RILE index. (See Appendix A.)
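A minimal sketch of these analytic standard errors, applying Equation 5; the example proportions below are illustrative values we have chosen, not quantities drawn from the CMP dataset:

```r
## Analytic SE of a reported CMP percentage (Equation 5): 100 * sqrt(p(1-p)/n).
se_per <- function(p, n) 100 * sqrt(p * (1 - p) / n)

se_per(0.05, 20)     # a 20-quasi-sentence manifesto: SE of about 4.9 points
se_per(0.05, 2000)   # a 2,000-quasi-sentence manifesto: SE of about 0.5 points

## For an additive scale, replace p with the summed proportion of the
## scale's component categories (an invented value here):
se_per(0.40, 171)    # SE for a summed RILE-style proportion at median length
```

The hundredfold difference in length between the two manifestos translates, as the n^(-1/2) relationship implies, into a tenfold difference in standard error.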

4.2 Estimating Error Through Simulation

It is also possible to assess the extent of error in the CMP estimates using simulations that recreate the stochastic process that led to the generation of each manifesto, based on our belief that there are many different possible manifestos that could have been written to communicate the same underlying policy position. This is done by bootstrapping the analysis of each coded manifesto, based on resampling from the set of quasi-sentences in each manifesto reported by the CMP. Bootstrapping is a method for estimating the sampling distribution of an estimator by resampling with replacement from the original sample. Bootstrapping has three principal advantages over the analytic derivation of CMP error from the previous section. First, it does not require any assumption about the distribution of the data being bootstrapped and can be used effectively with small sample sizes (N < 20) (Efron 1979, Efron & Tibshirani 1994). Second, because we simulate the process of generating manifesto codings from a set of unobservable fixed policy positions π_{ij}, bootstrapping allows us to intervene at other stages of the procedure to simulate other forms of error, such as coding error, differences in quasi-sentence parsing, or errors introduced when paper documents were keyed into a computer file. Third, bootstrapping can be used directly on additive indexes such as the CMP “right-left” scale without making the assumptions about the covariances of these categories that would be required to derive an analytic variance. Since the exact covariances of these categories are unknown, highly sample-dependent, and very probably also influenced by coder errors, we always recommend the bootstrapped error variances over an analytic solution for additive CMP measures.

The bootstrapping procedure is straightforward. Since the CMP dataset contains the percentages of total manifesto sentences coded into each category, as well as the raw total number of quasi-sentences observed, we can convert the reported percentages in each category back into raw counts. This gives us a new dataset in which each manifesto is described in terms of the number of sentences allocated to each CMP coding category. We then bootstrap each manifesto by drawing 100 different random samples (with replacement), each of n_i quasi-sentence codings, using as a sampling frame the CMP’s coded set of n_i coded quasi-sentences for the manifesto in question.9 Each (re)sampled manifesto looks somewhat like the original manifesto and has the same length, except that some sentences will have been dropped and replaced with other sentences that are repeated. We feel this is a fairly realistic simulation of the stochastic text generation process. The nature of the bootstrapping method applied to texts in this way, furthermore, will strongly tend to reflect the intuition that longer (unbiased) texts contain more information than shorter ones.

One problem that is not addressed by bootstrapping the CMP manifesto codings is that, as anyone who has a close acquaintance with this dataset will know, it is typical for many CMP coding categories to be empty for any given manifesto—resulting in zero scores for the variable concerned. No matter how large the number we multiply by zero, we still get zero. Thus a user of the CMP data dealing with a 20-sentence manifesto that populates only 10 coding categories out of 56 must in effect assume that, had the manifesto been 20,000 sentences long, it would still have populated only 10 categories. In extremis, if some manifesto populated only a single CMP coding category, then every sampled manifesto would be identical. We cannot get around this problem with the CMP data by bootstrapping, unless we make some arbitrary assumptions about probability distributions for non-observed categories, and we are very reluctant to do this.

The great benefit of bootstrapping the CMP estimates in this way, to simulate the stochastic process of text generation, is that we can generate the standard errors and confidence intervals associated with the estimated value for each coding category, and also for scales generated by combining these categories, such as the CMP additive left-right scale, or more specific scales measuring party positions on social liberalism or economic policy.

[FIGURE 2 ABOUT HERE]

The results of using this bootstrapping procedure are reported in Figure 2, where we show the relationship between manifesto length, measured in quasi-sentences, and the bootstrapped standard error of both the additive CMP left-right scale and PER501, the “pro-environment” category.10 The first panel (a) of Figure 2 shows the effects of the bootstrapped measure for PER501, the single category indicating a pro-environmental stance. As predicted in the previous section, its error variance declines directly with (logged) manifesto length. (The wide differences, indicated by the thick band, reflect the very different proportions p_{iPER501} for different manifestos i.) In panel (c), we compare the bootstrapped error variance to the variance computed analytically (per Equation 5) for the Environment (PER501) category. The near equivalence of these two methods of obtaining the standard errors serves both to validate the analytical derivation of the CMP error variance and to show that bootstrapping the manifestos from their quasi-sentences is a valid way to simulate error variance in the categories. The bootstrapped method, in other words, faithfully reproduces the error we have identified using the analytic variance based on the multinomial distribution.

9 In practice this is done in R using a routine for generating multinomial draws, taking p_i as given from the reported PER categories and n_i as given by the “total” variable reported in the CMP dataset.
10 Given the distribution of the data, both axes of Figure 2 use logarithmic scales. The figure demonstrates clearly that the level of bootstrapped error in this scale is very strongly related to manifesto length—short manifestos have much more potential for error of this type than long ones.
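The procedure can be reproduced in a few lines of base R, using multinomial draws on a manifesto’s reported proportions as footnote 9 describes. In the sketch below, the manifesto’s counts and the left/right category indexes are invented placeholders rather than the CMP’s actual data or its 26-category RILE list, and the right-minus-left construction stands in for an additive scale of that kind:

```r
## Bootstrap SEs for one manifesto's PER categories and a RILE-style scale.
## 'counts' and the left/right index sets are invented for illustration.
set.seed(1)
counts <- rmultinom(1, size = 300, prob = rgamma(57, shape = 0.5))[, 1]
n      <- sum(counts)
p_hat  <- counts / n
right  <- 1:13; left <- 14:26       # placeholder RILE component categories

B    <- 100                                   # bootstrap replicates
boot <- rmultinom(B, size = n, prob = p_hat)  # B resampled "manifestos"
per  <- 100 * boot / n                        # PER percentages per replicate

sd(per[45, ])                                 # bootstrapped SE, one category
rile <- colSums(per[right, ]) - colSums(per[left, ])
sd(rile)                                      # bootstrapped SE of the scale
```

Note that any category with a zero count in the original coding stays at zero in every replicate, which is exactly the empty-category limitation discussed above.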

5 Estimating Error in Manifesto Coding

The second type of error, separate from the stochastic process by which manifesto texts are generated from fixed policy positions, lies in the stochastic assignment of codes by humans to the fixed manifestos. The manifesto coding manual provided in Appendix II to MPP2 (164–194) identifies four steps in the coding procedure: identifying quasi-sentences; classifying the quasi-sentences; applying the coding protocol for country-specific categories; and compiling the coding sheet. We believe it is reasonable to assume that possible coder errors in the final two stages of the procedure will remain random. Possible mistakes in the first two stages, however, can result in measurement error and person-specific bias. In our application we estimate error variance by partially modeling the stochastic features of this manifesto coding process.

Regarding the first stage of identifying quasi-sentences, Volkens (2001b, 38) reports that the average deviation from the correct quasi-sentence structure in reliability tests of thirty-nine coders employed by the CMP was around ten percent. Only fifteen coders deviated from the standard quasi-sentence structure by more than ten percent, of which only three had a deviation between twenty and thirty percent. At the next stage of coding, when each quasi-sentence is classified into one and only one coding category of the fifty-six-category coding scheme, three particular issues have been identified by the CMP group: no category seems to apply to the quasi-sentence; more than one category seems to apply; or the statement in the quasi-sentence is unclear (MPP2, 170). Several solutions to these problems are offered in the coding manual. When no category seems to apply, the quasi-sentence may be marked as uncodable, with seldom-used categories being the most difficult to code (MPP2, 170). When more than one category seems to apply, the manual offers eight decision rules, ranging from re-reading the category descriptions, identifying connecting sentences, creating sub-categories, and checking section headings as cues, to more explicit rules that specific categories should be chosen ahead of more general categories (e.g. PER305 “political authority” and PER408 “economic goals”). When the statement seems unclear, the coder is advised to seek cues from the context and/or contact the supervisor in Berlin.

Two key implications can be drawn from this discussion in the CMP coding manual. The first relates to the choices a coder faces when coding policy-specific statements that require background knowledge different from that possessed by the coder. A manual coder in production coding (that is, coding several manifestos over a bounded period of time) may use more general categories rather than policy-specific categories. The choice may simply be dictated by the coder’s lack of confidence about the applicability of a particular policy-specific quasi-sentence to the relatively vague coding scheme vignettes supplied. Thus, a coder may “play it safe” and opt for a general category that is often used in similar circumstances. Although the manual stresses the importance of using policy-specific categories, we are not sure about its success rate. Comparing frequency distributions of codes by coder to their distribution in the general population of codes may be an indicator of coders’ personal styles. The second implication refers to the possibility of several categories being related to the same underlying idea, resulting in “coding seepage” between categories. In other words, the coding process may be subject to a degree of inherent misclassification, a process likely to cause bias whereby some true categories π_{ij} are over-represented by empirically observed categories p_{ij}. While this problem is widely acknowledged in CMP work (e.g. MPP2, 116; Volkens 2001a, 38) and certainly adds additional error to the CMP dataset, here we stick with the assumption of Equation 2, which states that whatever misclassification problems might exist in the coding process, they do not result in systematic bias.

An analysis of the results from repeated codings of the training document used by the CMP to initiate new coders, shown in Table 3, reveals important insights into both processes.11 First, it suggests what we already know: that different coders assign the same quasi-sentence to different categories. Second, it suggests that even using the best estimates from coders who have been corrected and given a second chance to get it right, the very best reliability estimates are still only in the region of 0.88–0.90. Finally, it also reveals just how different are the judgments made during the first step of parsing a document into “quasi”-sentences. In the tests we have analyzed, the 59 coders tested counted between 127 and 211 quasi-sentences, with a SD of 19.17 and an IQR of (148, 173). Furthermore, as mentioned earlier, inter-coder variance in counting quasi-sentences in the tests is typically +/-10% (Volkens 2001b, 38).

11 We thank Andrea Volkens for kindly sharing not only this dataset with us, but also her vast experience in dealing with different coders over the many years of the CMP. In reporting the variance of the quasi-sentence length of the training document, we have deliberately not reported the “correct” text length, since this document is still used by the CMP for training coders.

While we do not take the full step here of attempting to implement (and correct for) a misclassification matrix, we can simulate the effect of variable quasi-sentence length. Once again assuming that the expected value of the observed total count of quasi-sentences for a manifesto recorded by the CMP reflects the “true” number of quasi-sentences, we can add noise to the bootstrapping procedure in the form of varying the number of quasi-sentences by approximately +/-10%. For our purposes, this means drawing the number of quasi-sentences from a random normal distribution with a mean of n_i = ∑_{j=1}^{k} X_{ij} (assumed to be the total quasi-sentences reported in the CMP dataset), and a standard deviation of 0.10 of the mean.

[FIGURE 3 ABOUT HERE]

The results are shown in Figure 3, which plots the bootstrapped error without randomly drawing the simulated n_i vector against the bootstrapped error where it has been randomly drawn. As can be seen from the clustering about the unit axis, this procedure introduces additional noise but no evident bias. In the analyses in the next section that use error estimates in applied research, we will generally use the bootstrapped error with the randomized n_i as our best approximation of overall non-systematic error in the CMP’s reported estimates.
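Continuing the sketch from Section 4.2 (same invented inputs), the extra coder-noise step amounts to drawing each replicate’s length from a Normal distribution with mean n_i and standard deviation 0.10·n_i before the multinomial draw:

```r
## Add parsing noise: vary each replicate's length by roughly +/-10%
## before resampling. 'n', 'p_hat', 'right', 'left' as in the earlier sketch.
set.seed(2)
B   <- 100
n_b <- pmax(1, round(rnorm(B, mean = n, sd = 0.10 * n)))   # noisy lengths

per <- sapply(seq_len(B), function(b) {
  draw <- rmultinom(1, size = n_b[b], prob = p_hat)[, 1]
  100 * draw / n_b[b]
})
sd(colSums(per[right, ]) - colSums(per[left, ]))   # noisier SE of the scale
```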

6 Using CMP Error Estimates in Applied Research

There are two main research settings in which scholars need to estimate the policy positions of political actors. The first is cross-sectional: a map of some policy space is needed, based on estimates of different agents’ positions at the same point in time. The second is longitudinal: a time series of policy positions is needed, based on estimates of the same agent’s policy positions at different points in time. As we have noted, many alternative techniques are available for estimating cross-sectional policy spaces. The signal virtue of the CMP data is that they purport to offer time-series estimates of the policy positions of key political agents. Clearly, either cross-sectional or time-series estimates of policy positions convey little or no rigorously usable information if they do not come with associated measures of uncertainty. Absent any such measure, estimates of “different” policy positions may be either different noisy estimates of the same underlying signal, or accurate estimates of different signals. We have no way to know which is the case; yet we need to know if we are to draw valid statistical inferences based on variation in our estimates. Drawing such inferences is, of course, the entire purpose of conducting empirical research in the first place.


6.1 Estimating valid differences

A substantial part of the discussion found in MPP and MPP2 of the face validity of the CMP data comes in the early chapters of each book, in which the policy positions of specific parties are plotted over time. Sequences of estimated party policy movements are discussed in detail and held to be substantively plausible, with this substantive plausibility taken as evidence for the face validity of the data. But are such estimated changes in party policy “real”, or merely measurement noise? We illustrate how to answer this question with a specific example taken from Germany, the country in which the CMP has been based for many years. Figure 4 plots the time series of the estimated environmental policy positions of the CDU-CSU (PER501 in the CMP coding scheme). The dashed line shows the CMP estimates, while the error bars show the bootstrapped 95 percent confidence intervals around these estimates, calculated as described above.

[FIGURE 4 ABOUT HERE]

We see clearly from this figure that the error bands around the CMP estimates are large, and thus that most of the estimated “changes” in the CDU-CSU’s environmental policy over time could well be noise. Statistically speaking, we can conclude that the CDU-CSU was more pro-environmental in the early 1990s than it was either in the early 1980s or the early 2000s, but that every other observed “movement” on this policy dimension could well be caused by measurement error.

[TABLE 4 ABOUT HERE]

Table 4 reports the result of extending this anecdotal discussion in a much more comprehensive way. It deals with observed “changes” of party positions on the CMP’s widely-used left-right scale (RILE), and thus systematically summarizes all of the information about policy movements that is used anecdotally, in the early chapters of MPP and MPP2, to justify the face validity of the CMP data. The table reports, considering all situations in the CMP data in which the same party has an estimated position for two adjacent elections, the proportion of cases in which the estimated policy “change” from one election to the next is statistically significant. These results should be of considerable interest to all third-party researchers who use the CMP data to generate a time series of party positions: they show that observed policy “changes” are statistically significant in only about 28 percent of the relevant cases, or slightly more than one in four. We do not of course conclude from this that the CMP estimates are invalid. But we do conclude that many of the policy “changes” that have hitherto been used to justify their validity are not statistically significant and may well just be noise. More generally, we argue that, if valid statistical (and hence logical) inferences are to be drawn from “changes” over time in party policy positions estimated from the CMP data, then it is essential that these inferences be based on valid measures of the uncertainty in the CMP estimates, which have not until now been available.

[FIGURE 5 ABOUT HERE]

The other type of estimated policy difference that interests scholars analyzing party competition, as we have seen, concerns cross-sectional differences between different parties at the same point in time. Consider some static spatial model of party competition that is realized by estimating the positions of actual political parties at some particular time: many of the model’s implications will depend on differences in the policy positions of the various parties. It is crucial, therefore, when estimating a cross-section of party policy positions, to know whether the estimated positions of different parties do indeed differ from each other in a statistical sense. Figure 5 illustrates this problem for recent elections in Germany (top panel) and France (bottom panel), giving estimates for 2001 of party positions on the CMP left-right scale. These show us quite clearly, for the case of Germany, that the five political parties under investigation took up only three statistically distinguishable positions. Taking account of the uncertainty of the estimates, the Social Democrats and Free Democrats have statistically indistinguishable estimated positions on the CMP left-right scale, as do the Greens and PDS. The situation is even “muddier” in France, where four parties—Communists, Socialists, Greens and UMP—have statistically indistinguishable estimated positions on the CMP left-right scale in 2001. Only the far-right FN had an estimated left-right position that clearly distinguished it from all other parties.

To conclude this discussion, we can see that deriving measures of uncertainty for the CMP estimates is no small matter. On the one hand, it considerably extends our confidence in the inference that two party positions differ—cross-sectionally or longitudinally—if this difference can be shown to be significant in a statistical sense. On the other hand, we see that many of the policy “differences” on which third-party authors have hitherto relied in their empirical work may well be just noise. This stands to reason. It has never been a secret that some of the manifestos on which the CMP estimates are based are very short indeed, and thus convey rather little information, while other manifestos are quite long, and thus tend to convey much more information. Notwithstanding this, users of CMP data have hitherto treated all estimates as if they conveyed precisely the same amount of information. As we have just seen, using the measures of uncertainty we develop in this paper makes a big difference to the statistical inferences we draw from the CMP estimates, whether in a longitudinal or cross-sectional setting.
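Operationally, judging whether an estimated “change” is statistically significant reduces to comparing two estimates with their bootstrapped standard errors. A minimal sketch with invented numbers (not estimates for any actual party):

```r
## Is an observed policy "shift" distinguishable from noise?
## Positions and SEs below are invented for illustration.
rile_t1 <- -12.4; se_t1 <- 3.1   # estimated position at election t-1
rile_t2 <-  -6.8; se_t2 <- 4.0   # estimated position at election t

z <- (rile_t2 - rile_t1) / sqrt(se_t1^2 + se_t2^2)
abs(z) > 1.96   # FALSE here: the apparent 5.6-point shift could be noise
```

The same comparison applies cross-sectionally, to two parties’ positions at the same election.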


6.2 Correcting estimates in linear models

When covariates measured with known error are used in linear regression models, the result is bias and inefficiency when estimating coefficients on the error-laden covariates (Carroll, Ruppert, Stefanski & Crainiceanu 2006). The usual assumption is that coefficients will suffer from “attenuation bias”: they are likely to be biased towards zero, thus underestimating the relevant variables’ true effect.12 The CMP estimates of party policy positions clearly contain some error, even on the CMP’s own published accounts of inter-coder reliability. As we have shown above, it is possible to derive measures of the effect of some of the error processes involved in generating the CMP data. Nearly all empirical research that relies on CMP estimates as measures of party policy positions uses these estimates as covariates in linear regression models. Of all the studies that have used CMP data in this way, however, to our knowledge not a single one has explicitly taken account of the likelihood of error in the CMP estimates, or even used the length of the underlying manifesto as a crude indication of potential error. As a result, we expect many of the reported coefficients in studies that use CMP estimates as covariates to be biased and inconsistent.

We address this issue by replicating and correcting three recent high-profile studies that use the CMP data, each published in a major political science journal: Adams, Clark, Ezrow & Glasgow (2006, AJPS), Hix, Noury & Roland (2006, AJPS), and Golder (2006, BJPS). Each study uses CMP-based estimates of policy as explanatory variables, and each uses the measures in a slightly different fashion. In all three cases we obtained the datasets (and replication code) from the authors and replicated the analyses, correcting for measurement error in the CMP-derived variables. We do this using a simple error correction model known as simulation-extrapolation (SIMEX), which allows generalized linear models to be estimated with a correction for known or assumed error variance in error-prone covariates (Stefanski & Cook 1995, Carroll et al. 2006). SIMEX estimates and reduces the effect of measurement error on an estimator experimentally, via simulation. The essence of the method lies in a resampling-like stage, in which extra measurement error is added to the model in order to establish the trend in the bias due to measurement error relative to the variance of the added extra error. Following this trend, SIMEX then extrapolates back to the case of no measurement error. SIMEX is particularly useful for models with additive measurement error. More complicated solutions are possible, but here we deliberately chose a method that is simple, applicable to a wide class of generalized linear models, and for which freely available software exists for popular statistical packages.13

12 The assumption of attenuation bias in models with error-contaminated covariates has even been called the “Iron Law of Econometrics” (Hausman 2001, 58). However, Carroll et al. (2006, 41) suggest that the conclusion of attenuation bias must be qualified, because it depends on the relationship between the true covariate, its observed proxy, and possibly other covariates in the model. Moreover, when multiple variables contain measurement error, it will also depend on the relationship between the measurement error components of these error-contaminated variables. Thus, in linear regression, depending on whether the regression is simple or multiple, whether one or several covariates are measured with error, and whether the measurement error contains bias, the results can range from simple attenuation to cases where real effects are hidden, where relationships found in the observed data are nonexistent in the true data, or where the signs of coefficients are reversed relative to the true data.

13 For R, the simex package is available from CRAN. Information on SIMEX implementation in STATA can be found at http://www.stata.com/merror/.
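To make the mechanics concrete, the following is a minimal sketch of a SIMEX correction in R using the simex package mentioned in footnote 13. The data, variable names, and error standard deviation here are hypothetical stand-ins, not the replication data used below.

```r
library(simex)  # CRAN implementation of Stefanski & Cook (1995)

## Hypothetical example: y regressed on a covariate measured with known error.
set.seed(42)
n      <- 200
x_true <- rnorm(n)                      # unobserved true covariate
x_obs  <- x_true + rnorm(n, sd = 0.8)   # observed, error-contaminated version
y      <- 1 + 2 * x_true + rnorm(n)

## Naive model, ignoring measurement error; x = TRUE and y = TRUE keep the
## model frame, which simex() needs for its resampling stage.
naive <- lm(y ~ x_obs, x = TRUE, y = TRUE)

## SIMEX correction: supply the assumed measurement error s.d. for x_obs.
## Extra error is added at lambda = 0.5, 1, 1.5, 2 (B = 100 simulations each)
## and the coefficient trend is extrapolated back quadratically to lambda = -1.
corrected <- simex(naive,
                   SIMEXvariable     = "x_obs",
                   measurement.error = 0.8,
                   lambda            = c(0.5, 1, 1.5, 2),
                   fitting.method    = "quadratic")
summary(corrected)  # coefficient on x_obs moves back toward its true value, 2
```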

6.2.1 Adams, Clark, Ezrow and Glasgow (2006)

Adams et al. (2006) are concerned with whether “niche” parties (for example, Communists, Greens, or the extreme right) are fundamentally different from mainstream parties. The first model investigated in the article focuses on whether such “niche” parties adjust their policies in response to public opinion shifts. The authors operationalize party policy shifts between elections as changes in a party’s position on the CMP left-right scale (“Rile”) in adjacent elections. This measure is then regressed on public opinion shifts, a dummy variable for niche party status, and the interaction of these two variables, plus lagged policy shifts, lagged vote share change, the interaction of these two terms, and a set of country dummies.14 In our replication of Adams et al.’s Table 1, we focus on the effect of measurement error in the lagged policy shift and its interaction with lagged vote share change, since error effects in covariates are what is usually considered in the classical measurement error model. Our estimate of error for the lagged policy shift is based on the bootstrapped standard error. For the interaction term of lagged policy shift and lagged vote share change, we assume that change in vote share is measured without error. In this and in the replications that follow, our error estimate for each error-prone covariate is the mean of the in-sample average error variance from the bootstrapping procedure with randomized text lengths (and is specified in the note to each table).

The second model in Adams et al. (2006) tests whether policy adjustments tend to enhance parties’ electoral support. The authors model change in a party’s vote share as a function of parties’ policy shifts towards and away from the center of the voter distribution, and of how this differs between mainstream and niche parties. The key explanatory variables are derived directly from the CMP and thus are expected to be error-prone: Centrist policy shift, Noncentrist policy shift, Niche party * Centrist policy shift, and Niche party * Noncentrist policy shift. The first variable denotes the shift in a party’s policy position towards the mean voter position; it is calculated as the absolute value of the change in the party’s position on the CMP left-right scale when a leftist party shifts right or a rightist party shifts left, and zero otherwise. The non-centrist shift concerns the shift away from the center from election t − 1 to t; it is calculated as the absolute value of the change in the party’s position on the CMP left-right scale when a leftist party shifts left or a rightist party shifts right, and zero otherwise. The next two variables pick up the differences between niche and mainstream parties in the electoral effects of centrist and non-centrist policy shifts. Two additional control variables are based on CMP measures: Party policy convergence and Party policy convergence * Peripheral party. The former is constructed to capture the idea that a party’s electoral success may be affected by the policy shifts of other parties; it is operationalized as the sum of all centrist policy moves by all parties in the system. The latter is an interaction of Party policy convergence with a dummy variable for peripheral parties. In addition to these six error-prone covariates, Model 2 in Adams et al. (2006) contains dummy variables for niche parties, governing parties, and coalition governments, previous change in vote share, and several economic control variables: changes in unemployment and GDP rates and their interactions with the governing party dummy.

[TABLE 5 about here]

Table 5 presents results for two models, taken from the two regression tables in Adams et al. (2006). For each model, we report our replication of the published results, plus the SIMEX error-corrected results.15 The SIMEX correction of Model 1 strengthens the coefficient on the Previous policy shift variable by 40% and quadruples the coefficient on the interaction term of Previous policy shift and Previous change in vote share. Although error correction does not alter the remaining results greatly, it also illustrates the generally turbid nature of error processes: the coefficient on a key explanatory variable in Model 1 (Public opinion shift) weakens by 7%, and Previous change in vote share becomes statistically significant. Neither Public opinion shift nor Previous change in vote share is derived from the CMP; both are affected through their intra-model relationships with error-contaminated covariates.

In Model 2, a quarter of all variables are prone to measurement error. Due to high levels of collinearity, we should expect the SIMEX error correction to have a much more pronounced effect on the results than in Model 1. The key explanatory variable Niche party * Centrist policy shift does not change significantly (it weakens by only 5%) and still suggests that niche parties are punished at the polls for moving towards the center of the voter distribution. The SIMEX error correction unmasks several results that do not appear in the original analysis. The coefficient on Previous change in vote share gains slightly and becomes barely statistically significant. More importantly, the coefficient on Party policy convergence grows by 65% and becomes statistically significant, suggesting the substantive conclusion that when party policy positions move to the center of the policy space, centrist parties are electorally penalized more than peripheral parties. Another interesting result concerns the effect on mainstream parties of a policy shift away from the center of the voter distribution. A negative and statistically significant coefficient on Noncentrist policy shift is expected within the theoretical framework of Adams et al. (2006) but could not be shown empirically. Here, after the SIMEX error correction, the coefficient on Noncentrist policy shift almost doubles and becomes statistically significant. The effect of the SIMEX error correction on Noncentrist policy shift is shown graphically in Figure 6.

[FIGURE 6 about here]

Correcting for the bias induced by measurement error in Noncentrist policy shift is a good illustration of the general principles behind the SIMEX method. In Figure 6, the y-axis represents the size of the coefficient. The amount of noise added to the variance of the error-prone covariate is on the x-axis, in the form (1 + λ), where λ takes a set of discrete values, usually {−1, −.5, 0, .5, 1, 1.5, 2}. When λ = 0 the model is estimated without any error correction or error inflation; this is often called the naive model. Each node in Figure 6 represents the mean of 100 simulations of the model at each successive increase in λ, viz. each increase in the amount of measurement error added to the model. Once a pattern is established, SIMEX extrapolates back to the case of λ = −1, i.e. no measurement error. Among several extrapolants, a quadratic function has been shown to have good numerical properties. As Figure 6 makes clear, the naive model produced a coefficient on Noncentrist policy shift of around -2.3 (standard error = 1.8), and increasing amounts of measurement error bring the size of the coefficient closer to 0; the SIMEX correction, by contrast, shows the coefficient to be around -4.3 (standard error = 2.3), an 84% increase compared to the naive model. Overall, measurement error in linear models, whether in one or several covariates, cannot simply be dismissed: it affects the estimates produced in standard normal-linear models, sometimes in a substantively significant way.

14 We expect to find error contamination in the dependent variable on the left-hand side, in its lagged value on the right-hand side, and in the interaction of the lagged dependent variable and lagged change in vote share. In order to remain within the classical measurement error (CME) domain we assume that error in first-differences of the dependent variable is uncorrelated with error in second-differences of its lagged value. Within CME it is known that measurement error in the dependent variable, if uncorrelated with other covariates, will only inflate the standard error of the regression (Wooldridge 2002, 711-72). Moreover, the effect of measurement error in first-difference estimation in panel data models is much greater than in levels models, which may somewhat explain the low reported R².

15 In this replication and the ones that follow, we report our replicated estimates instead of the published estimates, which are in each case slightly but not significantly different. These differences arose due to slight errors in data preparation in each published analysis, which we uncovered in the course of replicating their results. Full details, including our own replication code, are available from http://kenbenoit.net/cmp/.
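The extrapolation step that Figure 6 depicts is straightforward to reproduce by hand. The sketch below, with made-up coefficient values standing in for the simulation means at each λ, fits the quadratic extrapolant and evaluates it at λ = −1:

```r
## Quadratic SIMEX extrapolation, illustrated with hypothetical values.
## Each beta value stands in for the mean coefficient across simulations
## at the corresponding amount of added measurement error lambda.
lambda <- c(0, 0.5, 1, 1.5, 2)
beta   <- c(-2.3, -1.9, -1.6, -1.4, -1.25)   # attenuates toward 0 as error grows

## Fit the quadratic trend in the coefficient as a function of lambda ...
quad_fit <- lm(beta ~ lambda + I(lambda^2))

## ... and extrapolate back to lambda = -1, the case of no measurement error.
beta_simex <- predict(quad_fit, newdata = data.frame(lambda = -1))
beta_simex   # the error-corrected coefficient estimate
```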

6.2.2 Hix, Noury, and Roland (2006)

Hix, Noury & Roland (2006) are concerned with the content and character of political dimensions in the European Parliament (EP). Following an inductive scaling analysis of all roll-call votes in the EP from 1979 to 2001, Hix, Noury & Roland (2006) attempt to validate their interpretations of these inductively derived dimensions by mapping variation in the dimensional measures to exogenous data on national party policy positions. The authors conclude that the first dimension is the classic left-right dimension, while the second dimension primarily concerns party positions on European integration as well as government-opposition conflicts. Substantive interpretation of the dimensions is achieved by regressing the mean position of each national party’s delegation of MEPs on each dimension in each parliament on two sets of independent variables. The first set includes exogenous measures of national party positions on the left-right, social and economic left-right, and pro-/anti-EU dimensions. The second set relates to government-opposition dynamics and consists of categorical variables indicating whether a national party was in government and whether the party had a European Commissioner, as well as dummy variables for each European party group, each EU member state, and each (session of the) European Parliament.

The measures of national party positions of interest to us here are either taken directly from the CMP data (like left-right) or constructed from it (EU, economic and social left-right). National party positions on the EU dimension are taken as the difference between positive mentions of the EU in a party manifesto (category PER108) and negative mentions (category PER110). Party positions on the economic and social dimensions follow the suggestion in Laver & Garry (2000, 628-629), where the economic dimension is constructed as the difference between the “Capitalist Economics” and “State Intervention” clusters, and the social dimension as the difference between the “Social Conservatism” and “Anti-establishment views” clusters from Laver & Budge (1992, 24).16 We expect to find error contamination in all models containing variables derived from the CMP data.

[TABLE 6 about here]

Table 6 reports our replicated estimations of four models in Hix, Noury & Roland (2006) along with error-corrected measurements based on our bootstrapped variances with randomized text length. In each error-corrected estimation, we used simulation-extrapolation with an error variance estimated as the mean of the in-sample average error variance from the bootstrapping procedure with randomized text lengths. Although Hix, Noury & Roland (2006) present numerous estimations, here we examine only those that used CMP-based policy measures as covariates in regressions on the first dimension from their inductive scaling procedure. Model 2 is the most basic model, regressing their first estimated dimension on several covariates: the general left-right and EU dimensions taken from the CMP, plus categorical variables indicating whether a national party was in government and whether the party had a European Commissioner, as well as dummy variables for each (session of the) European Parliament. Model 3 replaces the general left-right with two more specific left-right dimensions drawn from the CMP: economic left-right and social left-right. Model 6 extends Model 3 to include dummy variables for each European party group. Model 9 extends Model 6 to include a set of country dummies.

The first thing apparent from Table 6 is that the SIMEX error correction had only a minimal effect on most coefficients, with all error-prone covariates increasing only slightly and most other covariates decreasing slightly. By far the biggest correction is achieved on the coefficient on national party position on the EU dimension. This is to be expected, since the “pro- vs. anti-EU” variable constructed from the CMP data uses only two of the 56 substantive coding categories and is relatively sparsely populated with coded data. With the SIMEX error correction, the coefficient on the Pro-/Anti-EU variable increased by 78% in Model 2, doubled in Model 3, increased 2.5 times in Model 6, and 2.2 times in Model 9. In the latter two models, these large increases in the size of the estimated coefficients mean that the Pro-/Anti-EU variable becomes statistically significant, where it failed to do so in the naive models. For example, in Model 3 this would suggest that for the mean of each national party’s delegation on the first dimension to move from the center to the extreme right position, ceteris paribus, its position on the EU dimension would have to change by 23 points, not by 45 points as in the naive estimates. Party policy positions on the EU dimension have an in-sample range of 33. Thus, while such a wholesale move from one end of the political spectrum to the other seems highly unlikely, it is at least possible under the error-corrected results. Substantively, for explaining the mean position of a national party’s delegation, the national party’s position on the EU dimension is more important than its position on the general left-right or on the substantive economic and social left-right dimensions.

Moreover, the SIMEX error correction of the Pro-/Anti-EU variable, and its subsequent regaining of importance in Models 6 and 9, forces a rethinking of some of the conclusions of the article. Hix, Noury & Roland (2006) expect the first dimension to be explained by the policy positions of political parties on general (domestic) left-right issues, while the second dimension should be explained by party policy positions on the EU dimension.17 Indeed, the estimates from naive models, viz. those not taking account of measurement error bias in CMP-derived variables, provide ample evidence to support the original theoretical expectations: once party groups are controlled for, party positions on the EU dimension have no role to play in interpreting the first dimension. However, applying the SIMEX error correction to the error-prone Pro-/Anti-EU covariate not only more than doubles it in magnitude but also makes it statistically significant. Thus, the first dimension still contains Pro-/Anti-EU content, and not only left-right content as claimed by the authors. This calls into question the article’s conclusion that political parties’ positions in the EU policy-making process are shaped predominantly by socioeconomic positions in domestic politics rather than by preferences about European integration. Instead, the error-corrected results suggest that there is no clear distinction between general left-right positioning and positioning on European integration when it comes to the primary axis of contestation in the European Parliament, and that, if anything, positioning on the European Union appears to be even more important than general left-right positioning in explaining this primary axis.

16 Thus, economic dimension = (per401 + per402 + per407 + per414 + per505) - (per403 + per404 + per406 + per412 + per413); social dimension = (per203 + per305 + per601 + per603 + per605 + per606) - (per204 + per304 + per602 + per604).

17 In the words of Hix, Noury & Roland (2006), interpreting their results from the naive model: “EU policies of national parties and national party participation in government are only significant without the European party group dummies. This means that once one controls for European party group positions these variables are not relevant explanatory factors on the first dimension.” (502)
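The category arithmetic in footnote 16 is simple to reproduce. As a minimal sketch, assuming a data frame containing the CMP’s perNNN percentage variables as distributed in the MPP/MPP2 datasets (the function name and `cmp` are illustrative, not part of any released code):

```r
## Minimal sketch: constructing the EU, economic, and social left-right scales
## used by Hix, Noury & Roland (2006) from CMP category percentages, following
## footnote 16 and Laver & Garry (2000). `cmp` is an assumed data frame with
## the CMP's perNNN variables.
build_hnr_scales <- function(cmp) {
  within(cmp, {
    eu_position <- per108 - per110   # pro- minus anti-EU mentions
    econ_lr     <- (per401 + per402 + per407 + per414 + per505) -
                   (per403 + per404 + per406 + per412 + per413)
    social_lr   <- (per203 + per305 + per601 + per603 + per605 + per606) -
                   (per204 + per304 + per602 + per604)
  })
}
```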

6.2.3 Golder (2006)

Golder (2006) applied non-linear probit models to investigate the formation of pre-electoral coalitions between parties and the forces that may influence the likelihood of such coalitions. Pre-electoral coalitions between dyads of parties in a system are modeled as functions of Ideological incompatibility, Polarization and Electoral threshold (plus the interaction between the latter two), Coalition size and Coalition size squared, Asymmetry, and an interaction between Coalition size and Asymmetry. We can expect three variables to be particularly error-prone, since they are based on the CMP left-right measure: Ideological incompatibility, Polarization, and Polarization * Electoral threshold. Ideological incompatibility measures the ideological distance between the parties in a dyad. It is computed directly as the absolute value of the difference between the CMP “rile” scores of the parties in the dyad. We compute the error variance for the incompatibility measure by combining the error variances for the individual parties in the dyad, using the analytic variance of a difference between two uncorrelated variables (see Appendix A). Polarization measures the ideological dispersion in a system and is calculated as the difference between the CMP “rile” scores of the biggest left-wing and the biggest right-wing parties in the political system. We compute the error variance for Polarization following the same procedure as for Ideological incompatibility.

[TABLE 7 about here]

Table 7 applies the SIMEX correction to Golder’s probit model predicting the probability of pre-electoral coalition formation.18 The majority of covariates experience only minimal changes after correcting for measurement error. As a result of the SIMEX error correction, the coefficient on Polarization changes sign from positive to negative, but remains statistically indistinguishable from zero. Error correction on Incompatibility results in an 18% coefficient increase, making the formation of a pre-electoral coalition even less probable as the ideological incompatibility of potential partners increases. Although none of the substantive results changed with the SIMEX error correction, our implementation of error correction for this model demonstrates that error correction using our estimates of CMP error variances is straightforward to apply to non-linear models.

18 We depart from the original estimation in two ways here. First, we use the newer CMP dataset made available with the publication of MPP2, which results in some very slight differences in the data. Second, we estimate only a fixed effects probit model, rather than also the random effects probit model in the original piece, to avoid complicating the computation. Neither change resulted in any substantive differences between our replications and the original results.
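Under the uncorrelatedness assumption of Appendix A, combining party-specific error variances for a dyadic measure like Ideological incompatibility reduces to summing the two variances. A minimal sketch, with hypothetical inputs:

```r
## Error variance of the difference of two uncorrelated rile estimates:
## Var(rile_A - rile_B) = Var(rile_A) + Var(rile_B).
incompatibility_se <- function(se_a, se_b) sqrt(se_a^2 + se_b^2)

## Hypothetical bootstrapped standard errors for the two parties in a dyad:
incompatibility_se(3.1, 2.4)   # SE of the incompatibility measure
```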

7 Concluding Remarks and Recommendations

Our task in this paper has been fundamentally constructive. We have been concerned with a dataset that is, quite understandably, widely used in the profession, despite the fact that it is without doubt error-prone yet comes with no estimate whatsoever of the associated error. This is the CMP dataset on party policy positions, estimated on the basis of human-coded content analysis of party manifestos. Because these data have been, and will no doubt continue to be, widely used by researchers, we regard it as vital that they be used in a statistically rigorous way. Hitherto, third-party authors have either ignored or swept under the carpet this self-evident fact. The huge volume of CMP source documents, written in many languages, was coded once only by a single human coder and will likely never all be recoded many times by different coders. (Some of the documents may even be lost.) We offer a way around this problem, with a method for simulating the level of error in key CMP variables and a way to make use of these simulated errors when deploying CMP variables as covariates in statistical analyses, by far their most common use.

In this paper, we have characterized some of the error processes involved in generating these data and described techniques for simulating the effects of these error processes. Our method of bootstrapping the manifestos from their original sets of quasi-sentences, in addition to randomly varying text length to simulate non-systematic coder differences, offers a practical and theoretically supported means to estimate error variances for any single or additive CMP policy score. Based on this method, we offer a “corrected” version of the CMP dataset that has bootstrapped standard errors for key estimates. A full version of these corrections is available on our website, http://kenbenoit.net/cmp/.

From the scale of our estimated error variances for the most commonly used CMP categories, we have shown that many of the apparent “differences” in CMP estimates of party policy positions (differences over time in the position of one party, or differences between parties at one point in time) are probably attributable to measurement error. Thus only about one quarter of all CMP-estimated “movements” in parties’ left-right policy positions over time were assessed, on the basis of our simulations, to be statistically significant. Error in the measurement of covariates is likely to bias regression coefficients, yet none of the large volume of published empirical research using CMP data takes any account whatsoever of this. We replicated three prominent recently published articles in which error-prone CMP variables are used as covariates, and showed how to correct these using a SIMEX error correction model, parameterized by our bootstrapped estimates of likely error. The likely systematic effect of error-contaminated covariates is attenuation bias, and this is what we tend to find in each case. Some error-corrected effects are stronger, and more significant, than those estimated in models taking no account of error in the covariates. This can prompt substantively important reinterpretations of results in cases where inferences have relied on the lack of significance of some covariates. A good example is what emerges as the potentially flawed inference that national party policy positions on the EU have no influence on their EP roll-call voting behavior, an inference that is reversed once account is taken of error contamination in the CMP dataset’s sparsely populated variables measuring EU policy.

While we have taken an important first step towards providing a practical and theoretically supported means to estimate non-systematic measurement error in the CMP estimates, the solution we provide here is hardly the last word on this topic. Additional investigation into coder differences and/or coder error, for instance, could reveal systematic error leading to bias, which deserves to be investigated further in future work. Additional experiments on coder reliability using additional texts would also provide important information on the second type of error we identify. Finally, other means of implementing error correction models are certainly possible, including Bayesian-MCMC methods that can take into account the unit-specific nature of our error estimates. Indeed, we hope that our focus of attention on error in the widely used CMP estimates will stimulate a broader dialogue on measurement error in many of our most commonly used measures in political science, such as opinion survey results, expert survey measures, or other computed quantities. Given our knowledge of measurement error and the wide availability of techniques for dealing with such error, there is no longer any excuse for future scholars to use error-prone measures as if they were error-free.
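As a minimal sketch of the bootstrapping idea described above (an illustration, not the project’s actual implementation), given a vector of category codes for a manifesto’s quasi-sentences, one can resample with replacement, recompute the scale, and take the standard deviation across replicates. The function and scoring rule below are hypothetical, and the Poisson draw is just one way to randomize text length:

```r
## Minimal sketch of bootstrapping a manifesto's quasi-sentence codings.
## `codes` is a character vector with one CMP category per quasi-sentence;
## `score` maps a vector of codes to a scale value (e.g. rile).
bootstrap_scale_se <- function(codes, score, reps = 1000, random_n = FALSE) {
  stats <- replicate(reps, {
    n <- if (random_n) max(1, rpois(1, length(codes))) else length(codes)
    score(sample(codes, n, replace = TRUE))   # resample quasi-sentences
  })
  sd(stats)                                   # bootstrapped standard error
}

## Example with a toy scoring rule: % of quasi-sentences coded PER501.
codes      <- c(rep("per501", 30), rep("per000", 370))
per501_pct <- function(x) 100 * mean(x == "per501")
bootstrap_scale_se(codes, per501_pct)
```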

A Computing the Analytic Variance of CMP “PER” Categories

Assume that $E(p_{ij}) = \pi_{ij}$, where each $\pi_{ij}$ represents the true and unobservable policy position of country-party-date unit $i$ on issue $j$. For any given $X_{ij}$ from the multinomial distribution, $\operatorname{Var}(X_{ij}) = n p_j (1 - p_j)$. To derive the analytic variance for the percentage quantity $\operatorname{Var}(p_{ij} \cdot 100)$, we have, dropping the manifesto index $i$ for simplicity:

\[
E(X_j) = n p_j, \qquad p_j = \frac{x_j}{n}
\]

\[
\operatorname{Var}(p_j) = \operatorname{Var}\!\left(\frac{x_j}{n}\right)
= \frac{1}{n^2}\operatorname{Var}(x_j)
= \frac{1}{n^2}\, n\, p_j (1 - p_j)
= \frac{p_j (1 - p_j)}{n}
\]

Translating into the CMP’s percentage metric ($p_j \cdot 100$):

\[
\operatorname{Var}(p_j \cdot 100) = 10{,}000\,\operatorname{Var}(p_j) = \frac{10{,}000\, p_j (1 - p_j)}{n},
\qquad
\operatorname{SD}(p_j \cdot 100) = \frac{100}{\sqrt{n}} \sqrt{p_j (1 - p_j)}
\]

A similar procedure can be used to derive error variances for additive categories like the pro-/anti-EU scale (PER108 − PER110) or for RILE, the additive category obtained by summing 13 categories on the “left” and subtracting 13 categories on the “right”. (For a description of the process by which these categories were originally chosen, see Laver & Budge (1992).) The covariances between categories must also be estimated, however, as per the property of variance that $\operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y) + 2ab \operatorname{Cov}(X, Y)$. There are several strong reasons, including coder practice as well as innate relationships between mentions of related categories, to suspect that these covariances will be non-zero. Rather than attempt to estimate these covariances from finite samples, we prefer the bootstrapping procedure for additive categories.


References

Adams, James. 2001. “A Theory of Spatial Competition with Biased Voters: Party Policies Viewed Temporally and Comparatively.” British Journal of Political Science 31(1):121–158.

Adams, James, M. Clark, L. Ezrow & G. Glasgow. 2006. “Are Niche Parties Fundamentally Different from Mainstream Parties? The Causes and the Electoral Consequences of Western European Parties’ Policy Shifts, 1976–1998.” American Journal of Political Science 50(3):513–529.

Alt, James & David Lassen. 2006. “Transparency, Political Polarization, and Political Budget Cycles in OECD Countries.” American Journal of Political Science 50(3):530–550.

Baron, D.P. 1991. “A Spatial Bargaining Theory of Government Formation in Parliamentary Systems.” The American Political Science Review 85(1):137–164.

Bartolini, S. & P. Mair. 1990. Identity, Competition, and Electoral Availability: The Stabilisation of European Electorates 1885–1985. Cambridge: Cambridge University Press.

Benoit, Kenneth & Michael Laver. 2006. Party Policy in Modern Democracies. London: Routledge.

Bräuninger, T. 2005. “A partisan model of government expenditure.” Public Choice 125(3):409–429.

Budge, Ian, Hans-Dieter Klingemann, Andrea Volkens, Judith Bara & Eric Tanenbaum. 2001. Mapping Policy Preferences: Estimates for Parties, Electors, and Governments 1945–1998. Oxford: Oxford University Press.

Carroll, Raymond J., David Ruppert, Leonard A. Stefanski & Ciprian M. Crainiceanu. 2006. Measurement Error in Nonlinear Models: A Modern Perspective. Number 105 in “Monographs on Statistics and Applied Probability,” 2nd ed. Boca Raton, FL: Chapman and Hall/CRC.

Duso, T. & L.H. Röller. 2003. “Endogenous Deregulation: Evidence from OECD Countries.” Economics Letters 81(1):67–71.

Efron, Bradley. 1979. “Bootstrap Methods: Another Look at the Jackknife.” The Annals of Statistics 7(1):1–26.

Efron, Bradley & Robert Tibshirani. 1994. An Introduction to the Bootstrap. New York: Chapman and Hall/CRC.

Erikson, R.S., M.B. Mackuen & J.A. Stimson. 2002. The Macro Polity. Cambridge: Cambridge University Press.

Evans, G. & P. Norris. 1999. Critical Elections: British Parties & Voters. London: Sage Publications.

Golder, Sona N. 2006. “Pre-Electoral Coalition Formation in Parliamentary Democracies.” British Journal of Political Science 36(2):193–212.

Hausman, J. 2001. “Mismeasured Variables in Econometric Analysis: Problems from the Right and Problems from the Left.” The Journal of Economic Perspectives 15(4):57–67.

Hix, Simon, Abdul Noury & Gérard Roland. 2006. “Dimensions of Politics in the European Parliament.” American Journal of Political Science 50(2):494–511.

Janda, K., R. Harmel, C. Edens & P. Goff. 1995. “Changes in Party Identity: Evidence from Party Manifestos.” Party Politics 1(2):171–196.

Kim, Heemin & Richard C. Fording. 1998. “Voter Ideology in Western Democracies, 1946–1989.” European Journal of Political Research 33(1):73–97.

Klingemann, Hans-Dieter, Andrea Volkens, Judith Bara, Ian Budge & Michael McDonald. 2006. Mapping Policy Preferences II: Estimates for Parties, Electors, and Governments in Eastern Europe, European Union and OECD 1990–2003. Oxford: Oxford University Press.

Laver, Michael & Ian Budge. 1992. Party Policy and Government Coalitions. New York: St. Martin’s Press.

Laver, Michael & John Garry. 2000. “Estimating Policy Positions from Political Texts.” American Journal of Political Science 44(3):619–634.

Mair, P. 1987. The Changing Irish Party System: Organisation, Ideology and Electoral Competition. London: Pinter.

Meguid, B. 2005. “Competition Between Unequals: The Role of Mainstream Party Strategy in Niche Party Success.” American Political Science Review 99(3):347–359.

Milner, H.V. & B. Judkins. 2004. “Partisanship, Trade Policy, and Globalization: Is There a Left-Right Divide on Trade Policy?” International Studies Quarterly 48(1):95–120.

Petry, F. 1988. “The Policy Impact of Canadian Party Programs: Public Expenditure Growth and Contagion from the Left.” Canadian Public Policy/Analyse de Politiques 14(4):376–389.

Petry, F. 1991. “Fragile Mandate: Party Programmes and Public Expenditures in the French Fifth Republic.” European Journal of Political Research 20(2):149–171.

Petry, F. 1995. “The Party Agenda Model: Election Programmes and Government Spending in Canada.” Canadian Journal of Political Science/Revue canadienne de science politique 28(1):51–84.

Ray, L. 1999. “Measuring party orientations towards European integration: Results from an expert survey.” European Journal of Political Research 36(2):283–306.

Robertson, David. 1976. A Theory of Party Competition. London and New York: Wiley.

Schofield, Norman. 1993. “Political competition and multiparty coalition governments.” European Journal of Political Research 23(1):1–33.

Sigelman, L. & E.H. Buell. 2004. “Avoidance or Engagement? Issue Convergence in U.S. Presidential Campaigns, 1960–2000.” American Journal of Political Science 48(4):650–661.

Stefanski, L.A. & J.R. Cook. 1995. “Simulation-Extrapolation: The Measurement Error Jackknife.” Journal of the American Statistical Association 90(432):1247–1256.

Strom, K. & J.Y. Lejpart. 1989. “Ideology, Strategy, and Party Competition in Postwar Norway.” European Journal of Political Research 17(3):263–288.

Thomson, R., J. Boerefijn & F. Stokman. 2004. “Actor alignments in European Union decision making.” European Journal of Political Research 43(2):237–261.

Van Der Brug, W., M. Fennema & J. Tillie. 2005. “Why Some Anti-Immigrant Parties Fail and Others Succeed: A Two-Step Model of Aggregate Electoral Support.” Comparative Political Studies 38(5):537–573.

Volkens, Andrea. 2001a. “Manifesto Research Since 1979: From Reliability to Validity.” In Estimating the Policy Positions of Political Actors, ed. Michael Laver. London: Routledge, pp. 33–49.

Volkens, Andrea. 2001b. “Quantifying the Election Programmes: Coding Procedures and Controls.” In Mapping Policy Preferences: Estimates for Parties, Electors and Governments 1945–1998, ed. Ian Budge, Hans-Dieter Klingemann, Andrea Volkens, Judith Bara, Eric Tanenbaum, Richard Fording, Derek Hearl, Hee Min Kim, Michael McDonald & Silvia Mendes. Oxford: Oxford University Press.

Volkens, Andrea. 2007. “Strengths and Weaknesses of Approaches to Measuring Policy Positions of Parties.” Electoral Studies 26(1):108–120.

Webb, P.D. 2000. The Modern British Party System. London: Sage Publications.

Wooldridge, J. 2002. Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.

Total Quasi-sentences      N       Cumulative
(0,5]                        0       0.0%
(5,10]                      14       0.5%
(10,15]                     14       0.9%
(15,20]                     29       1.9%
(20,25]                     41       3.3%
(25,50]                    277      12.5%
(50,75]                    303      22.5%
(75,100]                   251      30.9%
(100,200]                  711      54.5%
(200,250]                  208      61.4%
(250,500]                  521      78.7%
(500,1000]                 343      90.1%
(1000,2000]                216      97.2%
(2000,5000]                 79      99.9%
> 5000                       4     100.0%
Total                    3,011

Table 1: Total quasi-sentences in the CMP dataset, for manifestos where the total is reported.

Percentage Uncoded      N       Cumulative
0                        864     29.0%
(0,5]                  1,121     66.5%
(5,10]                   375     79.1%
(10,15]                  193     85.6%
(15,20]                  119     89.5%
(20,30]                  129     93.9%
(30,40]                   87     96.8%
(40,50]                   50     98.5%
(50,65]                   30     99.5%
(65,80]                    9     99.8%
(80,95]                    7    100.0%
(95,100]                   –    100.0%
Total                  2,984

Table 2: Percentage of uncoded quasi-sentences in the CMP dataset, for manifestos where the percentage uncoded is reported.


Test description                                Mean Correlation    N     Reference
Training coders’ solutions with master                0.72          39    Volkens (2001a, 39)
Training coders’ second attempt with master           0.88           9    MPP2 (2006, 107)
All pairs of coders                                   0.71          39    Volkens (2001a, 39)
Coders trained on 2nd edition of manual               0.83          23    Volkens (2007, 118)
First-time coders                                     0.82          14    Volkens (2007, 118)
First test of coders taking second contract           0.70           9    Volkens (2007, 118)
Second test of coders taking second contract          0.85           9    Volkens (2007, 118)

Table 3: Coder reliability test results reported by the CMP. Sources: Klingemann et al. (2006); Volkens (2001a, 2007).

                                        Fixed Elections                        Random Elections
Statistically Significant Change?    N      % total  % applicable       N      % total  % applicable
No                                   1,523  52.9%    72.6%              1,509  52.5%    71.9%
Yes                                    576  20.0%    27.4%                590  20.5%    28.1%
N/A                                    778  27.1%    –                    778  27.0%    –
Total                                2,877  100.0%   100.0%             2,877  100.0%   100.0%

Table 4: Comparative over-time mapping of policy movement on the Left-Right measure, taking into account the statistical significance of shifts.


                                                       ------------ Model 1 ------------     ------------ Model 2 ------------
Variable                                               OLS Replication   OLS with SIMEX      OLS Replication   OLS with SIMEX
Intercept                                              0.0591 (0.1469)   0.0836 (0.146)      2.1212 (2.0939)   3.7403 (2.4541)
Previous policy shift                                  -0.5127 (0.0772)  -0.7158 (0.1036)
Previous policy shift * Previous change in vote share  -0.0044 (0.0184)  -0.0171 (0.0189)
Centrist policy shift                                                                        1.089 (1.3607)    0.5123 (1.4056)
Noncentrist policy shift                                                                     -2.3335 (1.837)   -4.2818 (2.3389)
Niche party * Centrist policy shift                                                          -5.4069 (2.823)   -5.1522 (2.8178)
Niche party * Noncentrist policy shift                                                       2.6163 (2.6946)   3.6774 (2.8245)
Party policy convergence                                                                     -1.2513 (0.8416)  -2.0689 (1.058)
Party policy convergence * Peripheral party                                                  1.3238 (1.1628)   0.9724 (1.1732)
Previous change in vote share                          0.0176 (0.0099)   0.0165 (0.0102)     -0.1548 (0.0943)  -0.166 (0.0939)
Niche party                                            0.1059 (0.1341)   0.1063 (0.133)      -1.4181 (1.8592)  -1.856 (1.8926)
Public opinion shift                                   0.8447 (0.2304)   0.9008 (0.2338)
Public opinion shift * Niche party                     -1.5508 (0.4433)  -1.5675 (0.4488)
Directional public opinion shift                                                             5.4118 (1.7649)   6.0475 (1.8046)
Governing party                                                                              -3.0767 (1.7271)  -3.114 (1.7206)
Governing in coalition                                                                       0.9764 (1.1872)   1.2103 (1.2048)
Change in unemployment rate                                                                  -0.8533 (0.7263)  -0.9707 (0.729)
Change in GDP                                                                                -0.3958 (0.3977)  -0.5087 (0.4063)
Governing party * Change in unemployment rate                                                0.1045 (1.0674)   0.0079 (1.0657)
Governing party * Change in GDP                                                              -0.021 (0.4996)   -0.0351 (0.4974)
Peripheral party                                                                             -0.1594 (1.8179)  0.4034 (1.8334)
RMSE                                                   0.5908            0.6057              6.0491            4.4781
R2                                                     0.3578            0.3247              0.2575            0.2338
Adj. R2                                                0.2982            0.262               0.0832            0.054
N                                                      154               154                 122               122

Table 5: Results of SIMEX error correction in Adams et al. (2006). Country dummies are included in the estimations but not reported here. Italicized and grayed-out variables are error-corrected as follows: (Model 1) Previous policy shift = .43, interaction of Previous policy shift and Previous change in vote share = .43; (Model 2) Centrist policy shift = 0.228, Noncentrist policy shift = 0.181, Party policy convergence = 0.847, interaction of Niche party and Centrist policy shift = 0.072, interaction of Niche party and Noncentrist policy shift = 0.048, interaction of Party policy convergence and Peripheral party = 0.390. Coefficients in bold are statistically significant at the p ≤ .05 level; SIMEX standard errors are based on jackknife estimation.

                                  ------- Model 2 -------          ------- Model 3 -------          ------- Model 6 -------          ------- Model 9 -------
Variable                          OLS Repl.        OLS SIMEX       OLS Repl.        OLS SIMEX       OLS Repl.        OLS SIMEX       OLS Repl.        OLS SIMEX
Intercept                         0.045 (0.0581)   0.044 (0.0584)  -0.149 (0.0596)  -0.181 (0.0604) 0.364 (0.0440)   0.339 (0.0459)  0.332 (0.0667)   0.296 (0.0693)
Pro-/Anti-EU                      0.018 (0.0062)   0.032 (0.0076)  0.019 (0.0062)   0.038 (0.0079)  0.004 (0.0037)   0.010 (0.0048)  0.005 (0.0038)   0.011 (0.0049)
Left-Right                        0.012 (0.0009)   0.012 (0.0010)
Social L-R                                                         0.011 (0.0023)   0.014 (0.0025)  0.004 (0.0013)   0.005 (0.0015)  0.003 (0.0014)   0.004 (0.0015)
Economic L-R                                                       0.024 (0.0022)   0.024 (0.0022)  0.006 (0.0015)   0.007 (0.0015)  0.007 (0.0016)   0.008 (0.0017)
Commissioner                      0.104 (0.0495)   0.095 (0.0492)  0.079 (0.0495)   0.070 (0.0492)  0.022 (0.0304)   0.022 (0.0304)  0.030 (0.0315)   0.031 (0.0314)
In Government                     0.107 (0.0434)   0.086 (0.0440)  0.103 (0.0434)   0.077 (0.0439)  0.061 (0.0259)   0.056 (0.0259)  0.034 (0.0271)   0.030 (0.0271)
Socialists                                                                                          -0.560 (0.0351)  -0.548 (0.0355) -0.556 (0.0350)  -0.544 (0.0357)
Italian Communists and allies                                                                       -0.641 (0.2115)  -0.630 (0.2114) -0.693 (0.2097)  -0.671 (0.2104)
Liberals                                                                                            -0.168 (0.0359)  -0.164 (0.0359) -0.178 (0.0360)  -0.173 (0.0360)
Greens                                                                                              -1.003 (0.0510)  -0.978 (0.0532) -1.017 (0.0524)  -0.990 (0.0546)
British Conservatives and allies                                                                    0.077 (0.0993)   0.071 (0.0991)  0.119 (0.1018)   0.107 (0.1019)
Radical left                                                                                        -0.820 (0.0496)  -0.786 (0.0527) -0.797 (0.0511)  -0.760 (0.0548)
French Gaullists and allies                                                                         0.099 (0.0623)   0.110 (0.0622)  0.079 (0.0644)   0.091 (0.0646)
Non-attached members                                                                                -0.230 (0.0539)  -0.225 (0.0539) -0.254 (0.0551)  -0.245 (0.0552)
Regionalists                                                                                        -0.785 (0.0568)  -0.765 (0.0577) -0.804 (0.0572)  -0.781 (0.0582)
Radical right                                                                                       0.447 (0.1244)   0.453 (0.1242)  0.393 (0.1245)   0.409 (0.1243)
RMSE                              0.3585           0.3621          0.3578           0.3645          0.2013           0.2022          0.1923           0.1935
R2                                0.4089           0.3971          0.4112           0.3892          0.8136           0.8119          0.83             0.8279
Adj. R2                           0.395            0.3829          0.3956           0.373           0.8029           0.8011          0.8122           0.8099
N                                 349              349             349              349             349              349             349              349

Table 6: Results of SIMEX error correction in Hix, Noury & Roland (2006, 503–504, Table 4). All models include dummies for parliament, and Model 9 includes country dummies, but these are not shown. Italicized and grayed-out variables are error-corrected as follows: Left-Right = 4.48, Social L-R = 2.27, Economic L-R = 2.06, Pro-/Anti-EU = 1.83. Coefficients in bold are statistically significant at the p ≤ .05 level; SIMEX standard errors are based on jackknife estimation.


Variable                    Probit Replication    Probit with SIMEX Correction
Intercept                   -2.0000 (0.2121)      -1.8220 (0.2267)
Incompatibility             -0.0146 (0.0024)      -0.0173 (0.0029)
Polarization                0.0014 (0.0028)       -0.0026 (0.0034)
Polarization * Threshold    0.0003 (0.0002)       0.0005 (0.0002)
Threshold                   0.0170 (0.0067)       0.0114 (0.0072)
Seat Share                  0.0409 (0.0088)       0.0408 (0.0088)
Seat Share^2                -0.0004 (0.0001)      -0.0004 (0.0001)
Asymmetry                   0.0243 (0.2455)       0.0113 (0.2467)
Asymmetry * Seat Share      -0.0226 (0.0071)      -0.0223 (0.0071)
-2 * Log-Likelihood         1250
N                           3385                  3385

Table 7: Results of SIMEX error correction in Golder (2006, 203, Table 1, Model 2). Italicized and grayed-out variables are error-corrected with standard deviations of 8.54, 8.25, and 8.25 respectively. Coefficients in bold are statistically significant at the p ≤ .05 level; SIMEX standard errors are based on jackknife estimation.


Figure 1: Summary of stochastic processes involved in the generation of policy texts


[Figure 2 appears here. Panel (a): PER501 (Environment) standard error vs. number of quasi-sentences; panel (b): RILE standard error vs. number of quasi-sentences; panel (c): bootstrapped vs. analytical PER501 (Environment) standard errors.]

Figure 2: Comparing estimated error variances for Environment (PER501) and the CMP Left-Right scale (RILE). Axes use log scales.


[Figure 3 appears here. Panel (a): Environment (PER501); panel (b): RILE; each panel plots the S.E. with random N against the S.E. with fixed N.]

Figure 3: Comparing bootstrapped standard errors when n is fixed to when n is randomly determined, for Environment (PER501) and the CMP left-right scale (RILE). Axes use log scales.


[Figure 4 appears here: the CMP Environment dimension (PER501) for the German CDU-CSU, plotted from 1950 to 2000.]

Figure 4: Movement on environmental policy of the German CDU-CSU over time. The dashed line is % environment with 95% CI; the dotted line is the number of quasi-sentences per manifesto coded PER501.


[Figure 5 appears here. Panel (a): Germany 2002 (CDU-CSU, SPD, FDP, Greens, PDS); panel (b): France 2002 (FN, UDF, UMP, Verts, PS, PCF); positions plotted on the CMP Right-Left dimension.]

Figure 5: Left-Right placement of German and French parties in 2002. Bars indicate 95% confidence intervals.






[Figure 6 appears here: the SIMEX extrapolation plot, with the coefficient on Noncentrist policy shift on the y-axis and (1 + λ) on the x-axis.]

Figure 6: SIMEX plot for Noncentrist policy shift in Table 2 of Adams et al. (2006). λ is the amount of noise added to the variance of the error-prone covariate; (1 + λ) = 1 is the naive estimate with no extra noise added. Measurement error is added in increments of .5 to establish a pattern, then extrapolated back (quadratically) to the case of no measurement error (λ = −1).
