Characterizing Measurement Error in Human Rights∗ Oona A. Hathaway† Yale Law School Daniel E. Ho‡ Yale Law School Department of Government, Harvard University First draft: August 30, 2004 This draft: August 30, 2004
Abstract
We illustrate a method for accounting for measurement error in human rights studies – an area of research plagued by difficulties of measuring concepts that cannot be directly observed. We focus on the widely used Purdue Political Terror Scales (PTS), which quantify political terror experienced in a country based on independent qualitative narrative reports compiled by the United States Department of State and Amnesty International. A simple Bayesian measurement model systematically incorporates these two independent codings and directly models the uncertainty of a latent measure of political terror. This reveals that attenuation bias due to lagged PTS estimates can be severe, leading conventional estimates to be conservatively biased by an absolute order of roughly two. Substantively, this means that explanatory variables such as democracy may have roughly twice the impact on human rights as currently believed. We conclude that measurement methods illustrated here hold much promise for addressing concerns about measurement error in empirical scholarship.
∗ We thank Kevin Quinn for many helpful conversations. Generous research support was provided in part by the Center for Basic Research in the Social Sciences, the Project on Justice, Welfare and Economics, and the Carnegie Scholars Program. † Associate Professor of Law, Yale Law School. Phone: 203-432-4825, Fax: 203-432-4871, Email:
[email protected], URL: http://www.law.yale.edu/outside/html/faculty/ohath/profile.htm. ‡ J.D. Candidate, Yale Law School; Ph.D., Department of Government, Harvard University. Phone 617-496-3798, Fax: 617-496-2254, Email:
[email protected], URL: www.people.fas.harvard.edu/~deho.
1 Introduction
In recent years, empirical work in international law and international relations has flourished. Human rights, in particular, has drawn the attention of a wide array of scholars, each of whom has sought to quantify which country characteristics help explain better government practices. Yet these studies have all suffered from the fact that researchers cannot directly observe many of the theoretical concepts they wish to study, such as democracy, corruption, or human rights. As a consequence, scholars have relied on suboptimal methods of operationalizing and measuring such concepts.
Measurement error looms large, threatening the validity of empirical investigations of social and legal theories. Whether researchers attempt to assess the effect of, for example, corruption on economic efficiency or democratization on human rights, the empirical test is only as credible as the measurement of the concept is accurate. Nonetheless, most studies tend to ignore the issue, assuming no measurement error, or relying on less than ideal methods of combining estimates of various indicators.
We illustrate new methods of assessing and incorporating measurement error in empirical studies. We focus on a burgeoning research area where measurement error is acknowledged to be particularly acute: human rights. Extant empirical studies of human rights have either ignored measurement error or have acknowledged it but been unable to fully address it. Since human rights indicators often enter as explanatory covariates, estimates that ignore measurement error are inconsistent. Some studies employ different indicators for human rights, running the analyses separately, often with different results for different indicators (e.g., Poe, Tate and Keith, 1999; Poe and Tate, 1994; Hathaway, 2002). Others combine the measures without directly accounting for the uncertainty inherent in doing so (e.g., Apodaca and Stohl, 1999).
To the extent the problem has been addressed, it has been portrayed as largely an issue of poor data availability, which it in part is (Hathaway, 2003; Goodman and Jinks, 2003; Jabine and Claude, 1992; Cingranelli, 1988).
But to date virtually none of the studies that make up the growing body of scholarship devoted to measuring human rights has focused on the ways the problem can be addressed through more effective analysis of data.1
1 One notable exception is Cingranelli and Richards (1999), which applies Mokken scale analysis to provide a measure of physical integrity rights. Our approach extends this work methodologically, by permitting the incorporation of estimation uncertainty of latent measures in the analysis model, and by not imposing a strong assumption of monotone homogeneity.
By contrast, the broader statistical community has made significant strides towards dealing with measurement error (Gustafson, 2004; Johnson and Albert, 1999; Treier and Jackman, 2003; Quinn, 2004; Carroll, Ruppert and Stefanski, 1995). These important developments allow researchers to obtain latent measures of human rights by explicitly modeling the measurement process, thereby providing a more unified framework of inference. Rather than avoiding the problem by running separate analyses on various measures of human rights (which may produce inconsistent results), arbitrarily combining estimates (which may ignore important parts of the data), or assuming perfect measurement, researchers can conduct an analysis that more accurately captures human rights practices of a country as a whole as well as the uncertainty in this assessment.
To demonstrate how researchers can capitalize on these methods, we start with the popular Purdue Political Terror Scales (PTS). The PTS provide two separate measures of the political terror experienced in a country during a given year based on independent qualitative narrative reports, one produced by the United States Department of State and the other by Amnesty International. These data provide a particularly revealing case study because they involve an effort to measure the same underlying concept using two separate sources. Indeed, researchers quantified the two separate political terror scales using identical coding criteria.
Using a Bayesian measurement model, we show that accounting for measurement error produces substantially stronger inferences on the causes of human rights abuses. Specifically, we show that
attenuation bias due to lagged PTS estimates can be severe, leading, on average, to point estimates that are conservatively biased by an absolute order of roughly two. Substantively, these results suggest that the marginal impacts of determinants of human rights infringements are on average roughly twice as large as previous studies suggest. This indicates that researchers can gain more leverage by using measurement methods that reduce attenuation bias.
The measurement methods outlined here thus bear much promise for unifying inference in human rights research: they point to one way that researchers can use multiple data sources to generate one latent unified measure of countries’ human rights practices that incorporates measurement error in the data sources. And this is not limited to human rights; the model holds similar promise for situations outside of human rights where researchers are also seeking to measure concepts that are difficult to observe directly.
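For intuition about attenuation bias, the following Python sketch simulates the textbook classical errors-in-variables case. This is an assumption made purely for illustration: it is much simpler than the ordinal, lagged PTS setting analyzed in this paper, and the names `ols_slope` and `simulate` are hypothetical. The noise variance is set equal to the variance of the true covariate, so that the reliability ratio is 0.5 and the estimated slope is roughly half the true slope; this choice merely mirrors the factor of two discussed above and is not estimated from PTS data.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=1.0, noise_sd=1.0, seed=0):
    """Return (slope using the true covariate, slope using a noisy proxy)."""
    rng = random.Random(seed)
    # True covariate with unit variance (hypothetical stand-in for a latent score).
    x_true = [rng.gauss(0, 1) for _ in range(n)]
    # Outcome depends on the true covariate.
    y = [beta * x + rng.gauss(0, 1) for x in x_true]
    # Observed proxy = truth + independent measurement error.
    x_obs = [x + rng.gauss(0, noise_sd) for x in x_true]
    return ols_slope(x_true, y), ols_slope(x_obs, y)


slope_true, slope_noisy = simulate()
print(f"slope with true covariate:  {slope_true:.2f}")
print(f"slope with noisy covariate: {slope_noisy:.2f}")
```

Under these assumptions the second slope converges to beta times var(x) / (var(x) + var(error)), which is why ignoring measurement error in an explanatory variable understates its estimated effect.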
2 Measuring Human Rights
Several decades ago, a small group of political scientists revolutionized the study of human rights by examining the causes of human rights abuses using empirical analyses of states’ practices (Carleton and Stohl, 1987; Henderson, 1993; Mitchell et al., 2002; Mitchell and McCormick, 1988; Poe and Tate, 1994; Stohl, 1975; Stohl, Carleton and Johnson, 1984). Only recently, however, has this approach made its way into legal scholarship, where it has generated substantial controversy (Hathaway, 2002, 2003; Goodman and Jinks, 2003). In the last year alone, panels on the issues surrounding empirical approaches to human rights have been held at the annual conferences of the American Society of International Law, the Law and Society Association, and now the American Political Science Association. Perhaps surprisingly, much of the debate among scholars has centered on the seemingly arcane issue of measurement error. This issue arises in human rights scholarship because the central variables of interest–countries’ human rights practices–are difficult to measure. These difficulties stem from both scarcity of
information and from inherent difficulties in measuring human suffering in quantitative terms. To begin with, the availability of information on countries’ human rights practices is variable. For some countries, there is a great deal of information and for others there is very little. Only a few organizations have made any significant attempt to gather the information across all countries over any significant span of time.2 But even when there is information, it often does not come in a form that is easy to analyze. Hence researchers have spent a great deal of effort translating qualitative accounts of countries’ human rights practices into quantitative data. Yet this process of translation inevitably introduces measurement errors of its own.
2 The four most prominent sources of comprehensive cross-national time series information on a broad spectrum of human rights practices are the United States Department of State Country Reports on Human Rights, Human Rights Watch’s reports, Amnesty International’s Country Reports, and Freedom House’s Freedom in the World reports. Although these data are the best available, each has been the subject of critique. And while there are many other sources of data on human rights practices, most do not cover all or nearly all countries in the world over a substantial period of time, as is necessary for the cross-national time series analyses that researchers are currently pursuing.
More specifically, there are several sources of quantitative human rights data, each with its own approach to quantifying human rights, but all plagued by the two types of measurement error outlined above. The first and most widely used is known as the Purdue Political Terror Scale. The Political Terror Scale was originally generated by Michael Stohl, whose study of the effect of war on domestic political violence was among the earliest empirical studies of human rights. Stohl found the available human rights data unsatisfying and thus set about constructing an index of human rights practices that has been at the core of empirical human rights research ever since. The dataset he constructed, and that Mark Gibney has worked to update, has formed the foundation of the majority of significant studies of human rights practices. A more detailed account of the dataset is provided in Section 3 below. Yet while the dataset is remarkably comprehensive and undoubtedly the best available, it is not without its drawbacks. Most notably, the two scales
(one based on the State Department reports and the other on the Amnesty International reports) often differ significantly, suggesting that measurement error is an issue in at least one of the scales.
A variety of other measures of human rights have also been used by political scientists and legal scholars. For example, many researchers use the Freedom House index of civil and political freedom to measure countries’ human rights records (e.g., Park, 1987; Simmons, 2002). Yet there has been some confusion as to whether these indices ought to be considered measures or determinants of human rights. Indeed, the index has been used by researchers both as a measure of the human right to civil and political freedom and as a measure of democracy (which is considered to be an important determinant of human rights practices) (Vanhanen, 2000, p. 252). Another source is data on torture and fair trials, coded by Oona Hathaway from the United States Department of State Human Rights Reports (e.g., Hathaway, 2002). And the Center for International Development and Conflict Management at the University of Maryland, College Park has developed data on “genocide and politicide” (Harff, 2003; Harff and Gurr, 1998). Again, though, like the PTS, these data rely on indirect observations of the practices they seek to measure. Most other sources of information on countries’ human rights practices remain narrative in form (for example, Human Rights Watch produces reports on countries’ human rights practices, much like those produced by Amnesty International) and are hence currently difficult to use in statistical analyses.
In response to the lack of certainty in existing human rights measures, researchers have adopted a variety of methods for incorporating information from human rights data. Some present several different sets of results based on different indicators.
Others simply present results from selected indicators, raising questions as to whether similar results would be obtained with different measures. Indeed, to the degree that the results of different analyses diverge, it remains unclear whether these divergences result from substantively different human rights indicators, measurement error,
different time periods or countries examined, or some combination of all of these factors.
To illustrate some of these challenges, Figure 1 presents six indicators that have been used to measure human rights. In clockwise order from the top left panel, these are genocide, torture, fair trial, civil liberty, the proportion of men in parliament, and the Amnesty International PTS. The white space indicates missing data, with darker shades indicating generally better human rights records. Each of these indicators spans a substantially different time period, so it may not be surprising that analyses employing different indicators should yield different results.
In the following section, we present one way to unify inferences across these measurements of human rights. Although these techniques may be used with a wider range of indicators, we use the PTS data to illustrate the approach. We do so for three reasons. First, for expository clarity, the PTS data provide a transparent setting that illustrates both the benefits and costs of measurement methods. Incorporating other indicators presents an even more daunting problem of missing data across large fractions of country-years. Second, the PTS data form the core of existing empirical human rights research, and both sets of data are explicitly aimed at measuring a common underlying concept: political terror. Indeed, numerous studies have employed the two sets of PTS data interchangeably for this very reason. Finally, unlike in the case of the PTS, the five other human rights indicators mentioned above may not necessarily seek to measure the same underlying concept.
The methodology presented here is not limited to the application demonstrated below. It may be used to analyze other concepts within the area of human rights; for example, it could be used to generate a measure of torture or fair trials, if multiple measures of those concepts were available.
Or it could be used to generate a single unified measure of human rights that is broader even than the PTS by using multiple indicators to generate a single measure. And, of course, the methodology can be used outside the field of human rights to address similar measurement difficulties in other areas of study (see, e.g., Treier and Jackman, 2003; Quinn, Hechter and Wibbels, 2003). Hence, although we illustrate the methods in this one simple setting, we hope that future research will explore other applications.
[Figure 1 image omitted. Panels: Genocide, Torture, Fair trial, Civil liberty, Men in parliament, PTS Amnesty. Each panel displays countries (Afghanistan through Zimbabwe) by year, with year ticks running from 1954 to 2002.]
Figure 1: Common Indicators of Human Rights. From the top left panel, in clockwise order, (a) death magnitude (1954-2001), (b) torture (1985-1999), (c) fair trial (1985-2000), (d) civil liberty (1972-2002), (e) proportion of men in parliament (1960-2003), (f) Amnesty PTS (1980-1996). For further descriptions see Hathaway (2002). White space indicates missing data; darker shades indicate generally better human rights records.
Level  Criteria
1      Countries under a secure rule of law, people are not imprisoned for their view, and torture is rare or exceptional. Political murders are extremely rare.
2      There is a limited amount of imprisonment for nonviolent political activity. However, few persons are affected, torture and beatings are exceptional. Political murder is rare.
3      There is extensive political imprisonment, or a recent history of such imprisonment. Execution or other political murders and brutality may be common. Unlimited detention, with or without a trial, for political views is accepted.
4      The practices of level 3 are expanded to larger numbers. Murders, disappearances, and torture are a common part of life. In spite of its generality, on this level terror affects those who interest themselves in politics or ideas.
5      The terrors of level 4 have been expanded to the whole population. The leaders of these societies place no limits on the means or thoroughness with which they pursue personal or ideological goals.
Table 1: Coding criteria for PTS scores applied to Amnesty International and State Department Country Reports.
3 The Political Terror Scales
The PTS dataset3 was coded in an academic project using annual Amnesty International and United States State Department Country Reports on Human Rights (e.g., Carleton and Stohl, 1987; Levinson, 2003; Zanger, 2000). The scale attempts to measure the degree of arbitrary physical harm and coercion by the government. Table 1 presents the criteria used to code the country human rights reports of the US State Department and Amnesty International. Figure 2 plots the measures from 1980–87, labeling representative countries at each level. For example, for all coded years, Canada exhibited the lowest political terror rating, whereas Afghanistan (1980, 1982–87) exhibited the highest level of political terror, reflecting conditions during the Soviet-Afghan war.
3 The PTS data is available at http://www.unca.edu/politicalscience/faculty-staff/gibney.html.
[Figure 2 image omitted. Scatterplot with State Department PTS (1 to 5) on one axis and Amnesty International PTS (1 to 5) on the other. Labeled points: Argentina (1985); Iran (1980, 1982-87); Afghanistan (1980-83, 1985-86); Soviet Union (1980-87); Mexico (1980-82, 1984-87); Canada (1980-87); Australia (1980-87); Guinea (1983).]
Figure 2: Correlation of political terror scale measures as coded by the US State Department and Amnesty International. All measures are ordinal integers from 1 to 5, with lower measures indicating fewer human rights abuses. The measures are jittered for visualization. Although the scales are highly correlated, substantial deviations exist.
While the data based on the Amnesty and State Department reports are generally correlated (the correlation coefficient is 0.83), political terror does not appear to be measured without error. While Argentina is coded by the State Department as having the highest level of political terror (5) in 1985, Amnesty rates it at the second lowest level (2). Conversely, Amnesty gives Guinea the lowest rating for political terror (1) in 1983, whereas the State Department coding (3) suggests that murders, disappearances, and torture are a common part of life. The two codings agree roughly 63% of the time, a rate of agreement that remains fairly constant from 1980 to 1987.
An examination of the last decade of reports by Amnesty and the State Department elucidates one possible reason for differences in the codings: State Department reports tend to be substantially more detailed than those provided by Amnesty. State Department reports tend to run roughly thirty pages, compared to no more than a few pages for Amnesty, and thereby cover events in much greater detail. Yet despite the difference in length, for 29% of the observations, ratings based on the Amnesty reports indicate that human rights records are worse than ratings based on the State Department reports.
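The summary statistics quoted above (a correlation coefficient, a rate of exact agreement, and the share of observations in which one source codes a worse record than the other) can be computed directly from paired ordinal codings. The following Python sketch illustrates the calculation. The function names and the eight country-year scores are hypothetical values introduced only for illustration; they are not PTS data and do not reproduce the figures reported in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def agreement_summary(amnesty, state):
    """Shares of country-years where the codings agree or differ in each direction.

    Higher scores mean more political terror, so 'amnesty_worse' counts
    country-years in which the Amnesty-based score exceeds the State-based score.
    """
    n = len(amnesty)
    pairs = list(zip(amnesty, state))
    return {
        "agree": sum(a == s for a, s in pairs) / n,
        "amnesty_worse": sum(a > s for a, s in pairs) / n,
        "state_worse": sum(a < s for a, s in pairs) / n,
        "correlation": pearson(amnesty, state),
    }


# Hypothetical scores for eight country-years (made up, not real PTS codings).
amnesty = [1, 2, 2, 3, 4, 5, 3, 1]
state = [1, 2, 3, 3, 4, 4, 2, 1]

summary = agreement_summary(amnesty, state)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Reporting the two directions of disagreement separately, rather than only the overall agreement rate, makes it visible whether one source is systematically harsher than the other.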
4 An Ordinal Factor Analysis Measurement Model
The approach we take here follows Treier and Jackman (2003) and is similar in spirit to multiple imputation of missing data (Rubin, 1987; King et al., 2001).[4] Specifically, we treat human rights as a latent (i.e., not directly observable) variable, and employ an ordinal factor analysis model to use the Amnesty and State scores to impute this missing data. The intuition of this is that we compute the principal component of these scores to extract the latent measure of political terror. The primary advantage of the Bayesian approach that we adopt here, compared to previous approaches in human rights (e.g., Cingranelli and Richards, 1999), is that it permits the incorporation of estimation uncertainty of latent measures in the analysis model.

[4] For generalizations to the continuous and mixed case, see Quinn (2004).

More formally, let $i = \{1, \ldots, N\}$ index countries, $t = \{1, \ldots, T\}$ index years, and $j = \{1, 2\}$ index Amnesty and State PTS response variables for each country-year. Each observed measure $y_{itj}$ is ordinal with 5 categories. $Y$ is generated by an ordered probit process with latent variable $N \times 2$ matrix $Y^*$ and cutpoints $\gamma$:

$$y^*_{it} = \Lambda \phi_{it} + \epsilon_{it}, \qquad \epsilon_{it} \sim N(0, I),$$

where $\Lambda$ is the $k \times 2$ matrix of factor loadings, and $\phi_{it}$ is a vector of length 2 of factor scores, where the first element of $\phi_{it} = 1$. The probability that the $j$th measure of human rights for country $i$ in year $t$ equals $c$ is:

$$\pi_{itjc} = \Phi(\gamma_{jc} - \Lambda_j' \phi_{it}) - \Phi(\gamma_{j(c-1)} - \Lambda_j' \phi_{it}).$$

The model is identified by constraining $\Lambda_{12}$ to be positive, setting $\gamma_1 = 0$, and by conjugate priors $\Lambda_{itj} \sim N(0, 10)$ and $\phi_{it(2)} \sim N(0, 1)$. We draw 1,000,000 samples from the joint posterior distribution using MCMC, with a burn-in period of 10,000 simulations and a thinning interval of 1,000.[5] Posterior diagnostics, monitoring autocorrelation and trace plots, indicate convergence.
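To make the ordered probit link concrete, the category probabilities $\pi_{itjc}$ can be evaluated directly from a row of loadings, a factor score vector, and a set of cutpoints. The sketch below is illustrative only: the loadings, cutpoints, and factor scores are hypothetical values rather than estimates from the model, and the helper names `norm_cdf` and `category_probs` are assumptions. With five categories there are four interior cutpoints, the first of which is fixed at zero.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF, Phi(x); handles +/- infinity via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def category_probs(loadings_j, phi_it, cutpoints_j):
    """Return pi_{itjc} for c = 1..C under the ordered probit link.

    loadings_j  : row j of Lambda (length 2)
    phi_it      : factor score vector (length 2, first element fixed at 1)
    cutpoints_j : increasing interior cutpoints (length C - 1)
    """
    eta = float(np.dot(loadings_j, phi_it))
    # Pad with -inf and +inf so the lowest and highest categories are covered.
    gamma = [-math.inf] + list(cutpoints_j) + [math.inf]
    cdf = [norm_cdf(g - eta) for g in gamma]
    return np.diff(cdf)

# Hypothetical values for one indicator and one country-year.
loadings = [-0.5, 1.2]
cutpoints = [0.0, 0.8, 1.6, 2.5]
probs = category_probs(loadings, [1.0, 0.3], cutpoints)
print(probs, probs.sum())
```

Because the second loading is positive in this toy setup, raising the second element of the factor score shifts probability mass toward the higher categories.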
5 Results

5.1 Latent Human Rights Scores
Figure 3 presents the estimates of the latent measure of human rights for 1985, where higher values indicate higher human rights scores. Given the simple two-dimensional setting, there is a ready interpretation of these scores. Specifically, they represent the first principal component of the PTS data:[6] if we simply rotated the x-axis of Figure 2 counter-clockwise by 45 degrees, each score would represent the country-year's location on the new x-axis along roughly 9 unique levels, where orthogonal components are ignored for calculating this dimension.

Given this simple interpretation, one might readily ask what the benefits of a measurement model are. We see the benefits as twofold. First, the measurement model explicitly takes into account the uncertainty of the latent factors.[7] Second, the measurement model quantifies mismeasurement that is not orthogonal to the principal component. The latter reason is why in Figure 3, Argentina, for example, has a much longer posterior interval that spans two latent clusters of countries in 1985, reflecting the greater uncertainty from the Amnesty and State reports. The posterior intervals also permit ready assessment of country progress across time in human rights. Bolivia and Argentina, for example, appeared to improve human rights treatment for much of the 1980s, whereas South Africa, Suriname and Sri Lanka experienced sharp declines in human rights treatment.

This measurement approach thereby permits human rights scholars to unify inferences. Rather than running separate analyses for multiple indicators, each measuring a single underlying variable with error, researchers can leverage multiple sources to explicitly model measurement error.

[5] This model is fit via R's MCMCpack (version 0.4-8), written by Andrew D. Martin and Kevin M. Quinn.

[6] The unit eigenvector is roughly $\{\sqrt{0.5}, \sqrt{0.5}\}$, so most measures are akin to taking the average of the Amnesty and State measures.

[7] To be sure, one might similarly imagine other ways in which principal component analysis could take into account estimation uncertainty of the principal component.
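The principal-component reading of the latent scores in this subsection can be checked numerically. The sketch below uses hypothetical toy codings, constructed so that each (Amnesty, State) pair appears together with its mirror image and both scales therefore have the same variance; it is not the PTS data. Under that assumption the first eigenvector of the covariance matrix is $\{\sqrt{0.5}, \sqrt{0.5}\}$, and projecting every possible pair of 1-5 codings onto it yields 9 distinct levels, one for each possible sum from 2 to 10.

```python
import itertools
import numpy as np

# Hypothetical (Amnesty, State) codings; mirroring each pair equalizes the variances.
base = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (2, 3), (3, 4), (2, 5), (1, 3)]
pairs = np.array(base + [(s, a) for a, s in base], dtype=float)

# First principal component = eigenvector of the covariance matrix
# with the largest eigenvalue (eigh returns eigenvalues in ascending order).
cov = np.cov(pairs, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
first_pc = eigvecs[:, -1]
first_pc = first_pc * np.sign(first_pc[0])  # resolve the arbitrary sign

# Project every possible pair of 1-5 codings onto the first component.
grid = np.array(list(itertools.product(range(1, 6), repeat=2)), dtype=float)
scores = grid @ first_pc
n_levels = len(np.unique(np.round(scores, 8)))

print(first_pc, n_levels)
```

The projection discards the component orthogonal to the first eigenvector, so a pair such as (2, 5) receives the same score as (3, 4) even though the two sources disagree far more in the former case. That discarded disagreement is exactly what a point estimate from principal components cannot convey.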
5.2 Impact on Substantive Analyses
To see why this model is an important addition to the array of tools available to scholars, we examine how accounting for measurement error with the model proposed above affects existing empirical analyses of human rights. In particular, we focus in this section on the effect of measurement error on inferences using data from Poe and Tate (1994), which remains, even a decade after its publication, one of the foremost studies in the field.[8] We condition on the results of the measurement model, using the marginal posterior distribution of latent factor scores $p(\phi \mid Y)$ as multiple imputations in the analysis model. This is akin to treating the human rights measure as missing data and the latent scores of the measurement model as fully imputed datasets.[9] This simplifies analysis tremendously.

[8] The Poe & Tate data are available at http://www.psci.unt.edu/ihrsc/poetate.htm. Poe & Tate “corrected” missing values by simply substituting State Department data for missing values of the Amnesty indicators (Poe and Tate, 1994, p. 855). While it is unclear exactly how many values were imputed in this fashion, Poe & Tate report that on average State covered 151 countries per year, compared to 132 for Amnesty, suggesting that just over 150 values were imputed this way. Multiple imputation, rather than hot-deck imputation, of these values would be a more principled way to address these missing scores (see Little and Rubin, 1987). For simplicity and expository purposes, we do not correct the dataset in this way. Not taking account of the uncertainty in these missing values should, if anything, lead us to underestimate the effect of measurement error. We also do not address the other assumptions imposed in the original analysis. Giving these analyses a causal interpretation requires additional assumptions of homogeneity of treatment, independence, and functional form that may be questionable in this study (see Ho et al., 2004).

[Figure 3: one panel per country, plotting the human rights rating φ (vertical axis, from −3 to 3) against year (horizontal axis).]

Figure 3: Estimates of latent human rights rating from 1980 to 1987, with lower values of φ indicating better human rights records. White lines represent posterior means and gray bands indicate 90% posterior intervals.

We keep the exact same specification as in Poe and Tate (1994), modeling human rights by ordinary least squares with a lagged dependent variable:

$$\phi_{it} = \beta_0 + \phi_{i,t-1}\beta_1 + X\beta_{2:k} + \nu_{it}, \qquad \nu_{it} \sim N(0, \sigma^2),$$

where $X$ represents observed covariates and $\theta = \{\beta, \sigma^2\}$ represents the vector of parameters from the analysis model. We evaluate the following integral by multiple imputation:

$$p(\theta \mid X, Y) \propto \int p(\theta \mid X, \phi)\, p(\phi \mid Y)\, d\phi.$$
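The integral above can be approximated by simulation: take one draw of the latent scores from the measurement model, fit the lagged regression to it, draw the regression parameters once, and repeat. The following is a minimal sketch of that loop on simulated data. Several things here are assumptions made for illustration and are not taken from the text: the flat prior $p(\beta, \sigma^2) \propto 1/\sigma^2$ used to draw the parameters, the single covariate, the simulated "posterior draws" of φ, and the function names `draw_theta` and `impute_and_analyze`.

```python
import numpy as np

def draw_theta(y, Z, rng):
    """One posterior draw of (beta, sigma2) for a linear model,
    assuming the flat prior p(beta, sigma2) proportional to 1/sigma2."""
    n, k = Z.shape
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    beta_hat = ZtZ_inv @ Z.T @ y
    resid = y - Z @ beta_hat
    s2 = resid @ resid / (n - k)
    sigma2 = (n - k) * s2 / rng.chisquare(n - k)        # scaled inverse chi-square
    beta = rng.multivariate_normal(beta_hat, sigma2 * ZtZ_inv)
    return beta, sigma2

def impute_and_analyze(phi_draws, x, rng):
    """phi_draws: (M, N, T) imputed latent scores; x: (N, T) covariate.
    Returns an (M, 3) array of draws for (intercept, lag, covariate)."""
    draws = []
    for phi in phi_draws:
        y = phi[:, 1:].ravel()        # phi_it for t >= 2
        lag = phi[:, :-1].ravel()     # phi_{i,t-1}
        Z = np.column_stack([np.ones_like(y), lag, x[:, 1:].ravel()])
        beta, _ = draw_theta(y, Z, rng)
        draws.append(beta)
    return np.array(draws)

# Simulated stand-in for the measurement-model output (hypothetical values).
rng = np.random.default_rng(0)
N, T, M = 50, 8, 200
x = rng.normal(size=(N, T))
truth = np.zeros((N, T))
for t in range(1, T):
    truth[:, t] = 0.5 * truth[:, t - 1] + 0.3 * x[:, t] + rng.normal(scale=0.5, size=N)
phi_draws = truth + rng.normal(scale=0.3, size=(M, N, T))  # noisy draws around the truth

theta_draws = impute_and_analyze(phi_draws, x, rng)
quantiles = np.percentile(theta_draws, [5, 50, 95], axis=0)
print(quantiles)
```

Pooling one parameter draw per imputed dataset mixes over the uncertainty in φ, so the resulting quantiles reflect both estimation uncertainty and measurement uncertainty. Replacing `phi_draws` with a single array of posterior means would reproduce the "no measurement error" comparison.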
Concretely, for each of 1,000 values of $\tilde{\phi}$ drawn from $p(\phi \mid Y)$, we obtain one draw from the posterior distribution of parameters $\theta$. We also investigated inferences with a much lower number of imputations, using standard combination rules for multiply-imputed datasets (Little and Rubin, 1987; King et al., 2001). Inferences were substantially similar with as few as five datasets. This suggests a simple way for applied researchers to incorporate the measurement model: run all analyses as previously, with the only difference being the use of multiply imputed datasets.

[9] A unified analysis would evaluate the joint posterior of all parameters, rather than separating the imputation step (see, e.g., Quinn, Hechter and Wibbels, 2003). The current illustration may be considered a case of multiple imputation inference with “uncongenial data,” since the imputation step did not employ all the information of the analysis model (see Meng, 1994). Recognizing the drawbacks of this second-best approach, we plan next to compare it to one that samples from the joint posterior.

To illustrate the effect of measurement error, Table 2 presents quantile statistics from the marginal posterior distribution accounting for measurement error and assuming no measurement error. While the left columns present estimates using multiple imputations of the latent human rights score, the right columns use only posterior means, akin to using only point estimates from the principal component. Assuming no measurement error, the model yields inferences that are substantively identical to Poe and Tate (1994). Yet relaxing the measurement error assumption yields a lagged coefficient estimate that is substantially lower (p-value=0.000). More importantly, each of the estimated substantive effects increases in magnitude as attenuation bias is reduced. The last column of Table 2 presents posterior p-values of whether accounting for measurement error leads to a different substantive inference on the main effects. The posterior probability that population has a larger effect when the assumption of no measurement error is relaxed is 0.98. Similarly, there is a roughly 0.96 posterior probability that the impact of GDP is higher than estimates from the model assuming no measurement error, and a 0.97 posterior probability that the impact of civil war is bigger.

                      Accounting for          Assuming no
                      Measurement Error       Measurement Error
                       5%      50%     95%     5%      50%     95%    p-value
Intercept              0.360   0.695   1.026   0.158   0.376   0.580   0.085
Lagged(Y)              0.514   0.573   0.633   0.746   0.775   0.809   0.000
Democracy              0.043   0.067   0.091   0.022   0.037   0.051   0.040
Population size       -0.083  -0.063  -0.044  -0.047  -0.034  -0.020   0.023
Population change     -0.033  -0.011   0.010  -0.022  -0.007   0.006   0.346
Economic standing      0.009   0.018   0.026   0.003   0.008   0.013   0.044
% economic change     -0.002  -0.000   0.002  -0.001   0.000   0.001   0.331
Leftist government    -0.087   0.015   0.116  -0.059   0.004   0.067   0.342
Military control      -0.048   0.035   0.121  -0.037   0.015   0.065   0.334
British influence     -0.013   0.062   0.136  -0.022   0.022   0.066   0.219
International war     -0.427  -0.277  -0.139  -0.250  -0.159  -0.074   0.117
Civil war             -0.520  -0.392  -0.268  -0.302  -0.230  -0.158   0.032

Table 2: Quantiles for marginal posterior distribution of main effects for Poe & Tate data. “Assuming no measurement error” indicates estimates from OLS using only the mean of the latent human rights measure, and “accounting for measurement error” indicates OLS estimates that sample from the posterior distribution of the human rights ordinal factor analysis model. The last column presents posterior p-values of whether accounting for measurement error leads to a different substantive inference on the main effects, based on $P(|\beta_{ME}| \geq |\beta_{NME}|)$, where subscripts indicate coefficients accounting for measurement error (ME) or assuming no measurement error (NME). For the lagged dependent variable, the posterior p-value represents the probability that the coefficient is lower accounting for measurement error.

Figure 4 plots the posterior densities of each of the parameter estimates, illustrating the presence of attenuation bias due to the lagged measure of human rights. In each instance, the posterior distribution of parameters shifts upward in absolute terms, as signified by the solid lines compared to the dashed lines. In other words, marginal effects are uniformly greater when accounting for measurement error. The last column of Table 2 calculates the probability that the substantive effect is greater in absolute terms. For six of the twelve parameters, the p-value is less than 0.1.

[Figure 4: twelve density panels, one per coefficient (Intercept, Lagged(Y), Democracy, Population size, Population change, Economic standing, % economic change, Leftist government, Military control, British influence, International war, Civil war), with density on the vertical axis.]

Figure 4: Attenuation bias due to measurement error. This figure presents posterior marginal distributions of coefficient estimates assuming no measurement error (dashed lines) and relaxing the measurement error assumption (solid lines). Posterior 90% intervals for the no measurement error model are shaded, and those for the measurement error model are denoted by a black solid line at the origin. The vertical line represents the origin.

Substantively, Poe and Tate (1994) are particularly interested in the marginal and cumulative effects of a change in democracy over time. We calculate the same quantities of interest here for both models, assuming perfect measurement and accounting for measurement error. Assuming perfect measurement, one might estimate that a decrease in democracy from the highest to the lowest value leads to a 0.22 decrease in human rights, which is roughly equal to the decrease in human rights experienced by Nicaragua with contra forces from 1985 to 1986.[10] Substantively, “if a democratic country with a near perfect human rights record were suddenly to abandon the democratic process, we would expect that the country would...begin to hold some political prisoners, and that political brutality, executions, and murders might become a common feature of life” (Poe and Tate, 1994, p. 860). Accounting for measurement error, however, one would estimate that the same loss of democracy leads to a decrease of 0.4 in human rights, which is roughly the decrease in human rights that Pakistan experienced in successive years from 1980-1982 under martial law. In short, the marginal impact of democracy in one given year is almost doubled by accounting for measurement error!

Figure 5 presents a conditional contour plot of the difference, due to accounting for measurement error, in the estimated effect in each subsequent year if a country changed from being fully democratic in year 0. The shading indicates the quantiles of the cumulative impact at each time period.
This figure shows that the estimates accounting for measurement error are almost twice those of the conventional estimates in the first year, but that the cumulative impact is roughly the same. To understand the intuition here, note that while marginal impacts are smaller in the model assuming perfect measurement, the coefficient on the lagged dependent variable is also substantially larger. Since the marginal effects in any one year affect subsequent years geometrically through the lagged dependent variable, both models converge at roughly the same cumulative effect over time. This convergence can be seen in the fact that the contour plot spikes in the first two years and then returns to the origin by year 10. In short, accounting for measurement error leads us to estimate larger immediate impacts, but the cumulative impact (notwithstanding the large extrapolations involved) would appear comparable. Given that extant studies appear to have underestimated the scale of immediate impacts of determinants of human rights, these results would militate for an even greater importance of short-term policies.

[10] This corresponds to the original estimate by Poe and Tate (1994, p. 860-61) that the marginal effect in the first year would be a 0.26 to 0.4 increase in the human rights index as scaled originally. Here, we report effects on the latent scale.

[Figure 5: contour plot titled “Difference in Cumulative Impact of Democracy Due to Measurement Error”; horizontal axis: Year (5 to 20); vertical axis: difference in marginal effect (−0.2 to 0.6); legend: conditional CDF (0 to 1).]

Figure 5: Contour Plot of Difference in Cumulative Effect Over Time of Democracy Due to Measurement Error. The shading (as annotated by the legend) indicates the CDF of the cumulative impact at each year. This figure shows that attenuation bias would lead to underestimation of the short-term marginal impact of democracy, but in the long term the cumulative estimated effects are comparable due to the geometric impact of marginal effects via the lagged dependent variable.
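The convergence of cumulative effects discussed in this subsection follows from the geometric series implied by the lagged dependent variable: a permanent change of size $\Delta$ in a covariate with coefficient $\beta$ accumulates to $\beta\Delta(1 + \rho + \cdots + \rho^{h})$ after $h$ further years, approaching $\beta\Delta/(1-\rho)$, where $\rho$ is the lag coefficient. The sketch below plugs in the posterior medians from Table 2 for democracy and the lagged variable. Two simplifications are assumptions of this illustration: it treats the medians as fixed point values, ignoring posterior uncertainty, and it takes the drop in democracy from the highest to the lowest value to be 6 units, a figure inferred from the reported first-year effects rather than stated in the text. The function name `effect_path` is hypothetical.

```python
def effect_path(beta, rho, change, years):
    """Cumulative effect on the outcome in each year after a permanent
    change in a covariate, under phi_t = rho * phi_{t-1} + beta * x_t."""
    path, total = [], 0.0
    for _ in range(years):
        total = rho * total + beta * change
        path.append(total)
    return path

CHANGE = -6.0  # assumed size of the drop in democracy (highest to lowest value)
YEARS = 20

# Posterior medians from Table 2 (democracy coefficient, lag coefficient).
me_path = effect_path(beta=0.067, rho=0.573, change=CHANGE, years=YEARS)
nme_path = effect_path(beta=0.037, rho=0.775, change=CHANGE, years=YEARS)

# Gap between the two models: large at first, shrinking as effects accumulate.
gap = [abs(m) - abs(n) for m, n in zip(me_path, nme_path)]

for year in (1, 2, 5, 10, 20):
    print(year, round(me_path[year - 1], 3), round(nme_path[year - 1], 3),
          round(gap[year - 1], 3))
```

Under these assumptions the first-year effects are about 0.4 and 0.22 in absolute value, in line with the figures reported in the text, while the long-run limits $\beta\Delta/(1-\rho)$ of the two models are close to each other because the smaller coefficient is paired with the larger lag coefficient.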
6 Concluding Remarks
By taking measurement error into account, human rights scholars can produce more accurate and powerful statistical analyses of countries' human rights practices. This has obvious value for those interested in human rights, as it permits more accurate assessments of how countries' characteristics influence their human rights practices. To the extent that this information helps to inform policy choices, better information will undoubtedly lead to more effective policies.

Our study also holds promise for many other empirical investigations in social science and law where theoretical concepts cannot be directly observed. When researchers possess several sets of data that each imperfectly measure an underlying condition (as here, where the PTS seeks to measure political terror in a country), this method allows researchers to account for that measurement error in a principled way. Although disagreements over the conceptualization of human rights will no doubt continue to abound, the proposed statistical framework can at least address some of the shortcomings of existing studies. By allowing researchers to use multiple indicators of human rights to more effectively measure underlying concepts that are difficult to quantify, the method provides a means of addressing errors that those with even the largest research budgets cannot otherwise easily overcome.
References

Apodaca, Claire and Michael Stohl. 1999. “United States Human Rights Policy and Foreign Assistance.” International Studies Quarterly 43:185–198.

Carleton, David and Michael Stohl. 1987. “The Role of Human Rights in U.S. Foreign Assistance Policy.” American Journal of Political Science 31:1002–18.

Carroll, R.J., D. Ruppert and L.A. Stefanski. 1995. Measurement Error in Nonlinear Models. London: Chapman & Hall / CRC.

Cingranelli, David L. 1988. Human Rights: Theory and Measurement. New York, NY: Palgrave Macmillan.

Cingranelli, David L. and David L. Richards. 1999. “Measuring the Level, Pattern, and Sequence of Government Respect for Physical Integrity Rights.” International Studies Quarterly 43:407–418.

Goodman, Ryan and Derek Jinks. 2003. “Measuring the Effects of Human Rights Treaties.” European Journal of International Law 14:171–183.

Gustafson, Paul. 2004. Measurement Error and Misclassification in Statistics and Epidemiology. New York, NY: Chapman & Hall.

Harff, Barbara. 2003. “No Lessons Learned from the Holocaust? Assessing Risks of Genocide and Political Mass Murder since 1955.” American Political Science Review 97:57–87.

Harff, Barbara and Ted Robert Gurr. 1998. “Systematic Early Warning of Humanitarian Emergencies.” Journal of Peace Research 35:551–79.

Hathaway, Oona A. 2002. “Do Human Rights Treaties Make a Difference?” Yale Law Journal 111:1935–2042.

Hathaway, Oona A. 2003. “Testing Conventional Wisdom.” European Journal of International Law 14:185–200.

Henderson, Conway W. 1993. “Population Pressures and Political Repression.” Social Science Quarterly 74:322–333.

Ho, Daniel E., Kosuke Imai, Gary King and Elizabeth A. Stuart. 2004. “Matching as Nonparametric Preprocessing for Improving Parametric Causal Inference.” Technical Report, available at http://gking.harvard.edu/files/matchp.pdf.

Jabine, Thomas B. and Richard P. Claude, eds. 1992. Human Rights and Statistics: Getting the Record Straight. Philadelphia, PA: University of Pennsylvania.

Johnson, Valen and James Albert, eds. 1999. Ordinal Data Modeling. New York, NY: Springer.

King, Gary, James Honaker, Anne Joseph and Kenneth Scheve. 2001. “Analyzing Incomplete Political Science Data: An Alternative Algorithm for Multiple Imputation.” American Political Science Review 95:49–69. Reprint at http://gking.harvard.edu/files/abs/evil-abs.shtml.

Levinson, Sanford. 2003. “‘Precommitment’ and ‘Postcommitment’: The Ban On Torture In The Wake of September 11.” Texas Law Review 81:2013–53.

Little, Roderick J.A. and Donald B. Rubin. 1987. Statistical Analysis With Missing Data. New York: John Wiley & Sons.

Meng, Xiao-Li. 1994. “Multiple-Imputation Inferences with Uncongenial Sources of Input.” Statistical Science 9:538–73.

Mitchell, Christopher, Michael Stohl, David Carleton and George A. Lopez. 2002. “Chap. 1. State Terrorism: Issues of Concept and Measurement.” In Government Violence and Repression: An Agenda for Research, ed. Michael Stohl and George Lopez. Greenwood Press.

Mitchell, Neil J. and James M. McCormick. 1988. “Economic and Political Explanations of Human Rights Violations.” World Politics 40:476–498.

Park, Han S. 1987. “Correlates of Human Rights: Global Tendencies.” Human Rights Quarterly 9:405–13.

Poe, Steven C. and Neal Tate. 1994. “Repression of the Human Right to Personal Integrity in the 1980s: A Global Analysis.” American Political Science Review 88:853–872.

Poe, Steven C., Neal Tate and Linda Camp Keith. 1999. “Repression of the Human Right to Personal Integrity Revisited: A Global Cross-National Study Covering the Years 1976–1993.” International Studies Quarterly 43:291–313.

Quinn, Kevin M. 2004. “Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses.” Political Analysis, forthcoming.

Quinn, Kevin M., Michael Hechter and Erik Wibbels. 2003. “Ethnicity, Insurgency, and Civil War Revisited.” Technical Report.

Rubin, Donald B. 1987. Multiple Imputation for Nonresponse in Surveys. New York: John Wiley.

Simmons, Beth A. 2002. “Why Commit? Explaining State Acceptance of International Human Rights Obligations.” Technical Report.

Stohl, Michael. 1975. “War and Domestic Political Violence.” Journal of Conflict Resolution 19:379–416.

Stohl, Michael, David Carleton and Steven E. Johnson. 1984. “Human Rights and US Foreign Assistance from Nixon to Carter.” Journal of Peace Research 21:215–226.

Treier, Shawn and Simon Jackman. 2003. “Democracy as a Latent Variable.” Stanford University.

Vanhanen, Tatu. 2000. “A New Dataset for Measuring Democracy, 1810-1998.” Journal of Peace Research 37:251–65.

Zanger, Sabine C. 2000. “A Global Analysis of the Effect of Political Regime Changes on Life Integrity Violations, 1977–93.” Journal of Peace Research 37:213–33.