
23 The Practice of Quantitative Methods

Kelvyn Jones, School of Geographical Sciences, University of Bristol, UK

Summary

Key concepts
• Quantitative methods in social science (QSS)
• Positivism
• Critical realism
• Closed and open systems

Key theorists
• Chong Ho Yu
• Ann Oakley
• Wendy Olsen
• Ray Pawson

Introduction

This chapter is concerned with why we use quantitative methods in social science (QSS), what they do and how to use them well, and how this all relates to different philosophical positions. In teaching quantitative courses I am often confronted by barely concealed hostility. The approach is seen as hard, trivial, bloodless, reductionist, reactionary and even dead. I want to begin by confronting these views.

• Hard: Quantification is undoubtedly demanding. A pervasive problem is that this type of knowledge is highly cumulative and you cannot fully appreciate the more sophisticated approaches without knowing the simpler ones. This is exacerbated by focusing on technique and failing to get over what Abelson (1995) has called 'statistics as principled argument'.

• Trivial: There is undoubtedly mindless empiricism where techniques are used just because they can be. But questions of real importance can only be tackled with good-quality extensive data and tools that can reveal pattern and guard against over-interpretation. Take poverty: how otherwise would you answer such questions as 'has inequality increased in the last 50 years?', or 'is there a permanent underclass?', or the causal question 'does poverty produce failure or does failure cause poverty?' While ethnography provides rich knowledge about those in poverty, such work cannot inform on how extensive such findings are.

• Bloodless: Quantitative work stresses objectivity because experience has shown that we find pattern where none exists. I give my students a set of maps with cot deaths marked by dots, and they find clusters in what is pure noise. A great deal of care is needed in making inferences; but that does not mean our work should not be impassioned. Indeed, removal of ignorance and false views is, I contend, genuinely emancipatory.

• Reductionist: Much quantitative work has indeed been over-generalizing – seeking the same results for all people for all time – and too atomistic in focusing on individuals and ignoring the context in which individuals find themselves. But this is changing as quantitative work is being developed that takes context seriously (Jones, Chapters 27 and 29, in this volume).

• Reactionary: In part this relates to the last point, that much work has been atomistic and has ignored context. But I believe that there is no necessary connection between political ideology and method. Thus, the Bell Curve's¹ argument that life outcomes are based on intelligence which can in part be genetically and racially inherited was based on quantitative analysis, but quantification was also used by the protagonists in the Bell Curve Debate (Russell and Glauberman, 1995) to argue that this was scientific racialism. Undoubtedly, you need a good understanding of quantitative social science to engage effectively in such debate and critique.

• Dead: Nothing could be further from the truth. Subjects and approaches do wax and wane, and currently quantitative social science is seen by the UK government as a 'strategically important and vulnerable subject'. But if you look more widely, thousands of postgraduates are being trained in quantitatively orientated summer schools every year (e.g. at Essex, Ljubljana, Michigan), prestigious universities are setting up QSS institutions, and the mainstream social science journals routinely publish a large range of quantitative studies. Indeed, in Freakonomics (Levitt and Dubner, 2005) the approach has a best-seller which is being made into a film!

Quantification and philosophies of science

The philosophy of science is concerned with the underlying logic of the scientific method. It tries to make sense of what researchers do, as well as how to do things better. It consists of two parts: ontology – what exists – and epistemology – what can be known, and how can we know it? The answers are highly contested, with a number of competing 'isms'. Much of the claim that QSS is defunct rests on the elision that science = positivism = quantification: as positivism is discredited, so is quantification. Here, I want to argue two things. First, that some of the tenets of positivism debar what are currently state-of-the-art procedures in QSS. Second, that a contemporary philosophy of science, critical realism (CR), does see a role for quantification alongside other practices, and does so for more than simple counting or enumeration.

Positivism

The many variants of positivism have two distinguishing features:

• Explanation can only be based on observable and measurable events. This empiricist ontology aims to guard against metaphysical mysticism where explanation is based on authority – accepting what someone said without looking and measuring. Science as positivism focuses on data as value-free facts, where observation is performed through a one-way mirror; what is being measured is not changed by being observed. Observations are independent of any theoretical statements that might subsequently be constructed around them. At the extreme, beliefs, emotions and values are outside science, the purpose of which is to stick to what we can observe.

• Explanation as a regularity: one event causes another if it is regularly followed by it. Generalized knowledge is obtained by identifying such constant conjunctions or event regularities, hence the quantitative search for order and pattern as associations between variables. In this successionist epistemology, causation is replaced by universal regularity. This dispenses with any mysterious causal necessity or unobservable processes, like explanations based on God's will. Scientists should seek 'covering' laws where events are seen as specific instances of a general law which is applicable universally for all time and all places. Thus, this shopper goes to this store because of the rational law of distance minimization.

In practice, this approach has often degenerated (of necessity?) into instrumentalism, whereby the world is treated as a black box without any need to understand processes. The truth or falsity of theoretical statements is not the issue, for they are regarded as mere computational devices; knowledge is judged true because it is useful. Being able to predict is to be able to explain, and being able to predict allows control. To be able to 'drive' the system does not require understanding how the 'engine' works; and so we have an instrumentalist rationality, whereby scientific thinking itself has become an ideology: the ends justify the means.

Critical realism

Unlike positivism, with its early nineteenth-century origins, critical realism was initiated much more recently by Roy Bhaskar (1975, 1979) and Rom Harré (Harré and Madden, 1975).² It offers a radical alternative to positivism (the goal is not generalizable laws); to interpretivism (the goal is not solely to appreciate the lived experience of social actors); and to postmodernism (there are truths, and knowledge is more than just some undetermined socially-constructed linguistic system). A critical realist believes that there is an independent reality that science can study – each of us is not making it all up! It aims to develop deeper levels of explanation as causal necessity, not just regularity. It provides a logic of enquiry based on the fundamental formula:

Mechanism + Context = Outcomes

Mechanisms are not regularities but are potentially causal generative processes that operate in particular historical, local or institutional contexts to produce particular patterns of outcomes. For me, CR provides a much more congenial home for the practice of quantitative research. Congenial because it makes sense of my practice and I do not have to pretend that I am doing something else. It provides a strong bastion against the radical unreason and relativism of the strong constructionist viewpoint where what is true depends entirely on where you stand; it explains why scientific practices work; and it puts sensible limits on what can be achieved.

Critical realism has a multi-layered and stratified ontology. The world and our knowledge of it can be seen as three overlapping domains which of necessity must exist to permit the intelligibility and success of scientific practice. These domains have specific propensities:

• the empirical: aspects of reality that can be experienced and observed, directly or indirectly; these experiences constitute parts of the 'events', which we can identify as the domain of
• the actual: aspects of reality that occur, but may not necessarily be experienced; these are in turn the outcomes of the domain of
• the real: 'deep' structures and mechanisms or tendencies that generate phenomena.³

The last are the key objects of knowledge in both natural and social science. In the social sciences, mechanisms are social practices which are outcomes of structures of social relations. Disputed entities such as class relations exist independently of us and are not simply human constructs; but, unlike in positivism, they may not be directly observable while still being real, with observable outcomes. These structures are intransitive in operating independently of our knowledge of them (except when we intervene), but our knowledge of them is transitive and capable of being changed. Moreover, this transitive knowledge is a product not only of our fallible cognitive capacities, but also of the ideological pressures of the culture of any given scientific community.

Mechanisms have the tendency to behave in such a way because of the structure of the underlying object. Thus a landlord–tenant structure necessitates the mechanism of the payment of rent. Such mechanisms are contingent and there may be countervailing tendencies in certain contexts which prevent them from operating. Consequently, what causes something has nothing to do with the number of times it has happened; necessity implies neither regularity nor universality. Moreover, we can have emergent powers, new ways of operating due to the complex interplay of mechanisms. Water has the power to extinguish fire, a property which is not contained in its constituent parts of hydrogen and oxygen. Simply breaking down objects into their parts, reductionism, is therefore a fallible strategy.

Closed systems and the role of experiments

The aim of CR science is to uncover these causal powers and structures. Natural science is greatly helped by having access to closed systems, either through their natural occurrence (as in the near-clockwork universe of planetary astronomy) or by creative intervention (in a machine or experiment). With closure it is much easier to see mechanisms as regularities, because everything else is kept constant. Thus, both the regularity conception and instrumentalism are effective in such a system, so that laws not only explain but predict such things as solar eclipses. To achieve this, two aspects of closure are required. First, the intrinsic condition requires that there is no change in the object possessing the causal powers. Second, the extrinsic condition requires that the relationship between the causal mechanism and external conditions remains constant. To achieve clockwork regularity and predict the hour requires that the spring must not suffer metal fatigue and that the mechanism must be isolated from any tampering. CR clarifies that the purpose of a well-designed experiment is to intervene to isolate a mechanism and trigger its outcome in a regular sequence: experimental production as well as experimental control. In a closed world we create the conditions for the formula

Regularity = triggered Mechanism + stable Context

and thereby considerably aid the elucidation of explanatory mechanisms and structures.

Open systems and the need for methodological pluralism

Some natural sciences – climatology and geology – have difficulty in securing closed systems, but still make great progress in understanding the world. They do so by inferring that mechanisms are operative in open environments, but may be hidden or counteracted by other powers. Thus geology uses knowledge of mechanisms to appreciate where oil may have been formed, but needs to drill – empirical enquiry – to substantiate this; it is not knowledge of the causal mechanism that is incomplete, but knowledge of the accidental contingent conditions that are the hallmark of open systems. All science does this to some extent, transferring knowledge gained under artificially constructed environments to open systems. The social world is undoubtedly open, for we are capable of conscious reflection and change (akin to the cogs of a clock deciding to change the gearing ratio) and we have the capacity to re-configure the system (the fall of communism is akin to an acid-bath for the clock). While we should not expect to find universal regularities within the social world, underlying causal structures may well give rise to differences or contrasts that are relatively enduring over space or time. Such patterns or 'demi-regs' (Lawson, 1989) may prove a good starting point for a CR investigation to uncover the mechanisms or constraints generating them.

The multi-layered depth ontology and open systems that characterize social science require different types of practice to obtain knowledge (Sayer, 1992). Methodological pluralism is a necessity, but it cannot be adopted unthinkingly. The concrete multifaceted objects of our observable world require rational theoretical abstraction to distinguish contingent from necessary relations, and to identify structures and counterfactuals. Intensive work, including qualitative research, is required to see how mechanisms work out in particular cases; extensive work means looking at the demi-regs to see the evidence for processes in action and how widespread phenomena are. Finally, a synthesis is required to put it all together as explanation building: not just whether this causal process produces an effect, but why, when, how, and for whom. We need to be realistic about this: the heterogeneity and complexity of what we are studying limits what can be achieved, and this is not just lack of skill or maturity in social science, but a constraint imposed by the open nature of the social world.

Returning now to the practice of QSS, I want to look at what we do in the light of these 'isms'.

Putting thoughts into quantitative practice

The positivist blueprint of confining explanation to observables is not the way that quantification works. Usually we want to use measurements to infer beyond the immediate. After interviewing 5,000 households about their income, we want to infer robustly about the people we have not measured, to answer such questions as what proportion of the UK population is in poverty. Theory and a representative sample guarantee that we can make the inference to the unknown proportion with a known degree of confidence (Barnes and Lewin, in this volume). This inference to the directly unobservable is becoming a prominent part of quantitative work. Chapters 27 and 29 discuss multilevel models in which there are random effects. We posit that schools have a differential effect on student progress, and while we can only measure pupils, we can estimate the school differential effects. Such latent modelling is focused on the quantitative analysis of what cannot be directly measured.

Another important area is causal inference. To know the causal effect of going to university on subsequent earnings, we need the outcome for the same individual who has and who has not gone to university. Without both counterfactuals we cannot be sure that nothing else has affected the outcome, or operated as a confounder between earnings and learning. It is obviously impossible to observe both potential outcomes; one is always missing. Consequently, the effect of higher education cannot be measured but can only be estimated. The development of the potential-outcomes approach to causal inference (Morgan and Winship, 2007) does not concern itself with correlational associations and the prediction of future events from past events (as in the positivist recipe), but with a logical framework for thinking about causality and under what conditions valid inference can be made (Pearl, 2009), and with how to find evidence that the 'switching on' of a process and the 'holding off' of others leads to a difference in outcome. This positivist tenet, the irrelevance of causality, is thus at odds with current practice. Such quantitative work (contra most CR accounts) appears to be useful in intensive work, determining how things are caused.
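To make the missing-counterfactual point concrete, here is a minimal simulation sketch in Python. All the numbers are invented for illustration: each simulated person has two potential earnings outcomes, only one of which is ever observed, and because the able are more likely to go to university the naive observed gap overstates the true effect.

```python
# A minimal potential-outcomes sketch (all numbers invented for illustration).
# Each simulated person has two potential earnings: y1 if they go to
# university and y0 if they do not; only one is ever observed.
import random

random.seed(1)

records = []
for _ in range(100_000):
    ability = random.gauss(0, 1)               # unmeasured confounder
    y0 = 20_000 + 5_000 * ability + random.gauss(0, 2_000)
    y1 = y0 + 4_000                            # true causal effect: +4,000
    goes = ability + random.gauss(0, 1) > 0    # the able are more likely to go
    records.append((goes, y1 if goes else y0)) # the counterfactual is lost

uni = [y for went, y in records if went]
rest = [y for went, y in records if not went]
naive_gap = sum(uni) / len(uni) - sum(rest) / len(rest)

print("true average effect:  4000")
print(f"naive observed gap: {naive_gap:6.0f}")  # inflated by self-selection
```

The naive comparison mixes the causal effect with confounding from unmeasured ability; randomization, instruments and the other designs discussed below are all devices for breaking exactly that link.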


I contend that the best quantitative work involves, often unknowingly, the practical realities that CR explains, and that by understanding why, we can do better. What follows is my list of rules of engagement for a post-positivist, CR-inspired QSS.

Pay attention to abstraction and classification: Abstraction and classification are key processes and we must try to distinguish the necessary from the contingent. Thus, the landlord–tenant relation is a necessary one, but whether the actors in these roles are male or female, young or old, is contingent and arbitrary (Allen, 1983). In this context, theory is not grandiose speculation but rather mundane questions about what exists, what social actors do, and to whom or with whom they do it. This requires a more sophisticated social theory than either methodological individualism or totalizing structuralism, and needs us to recognize that both agents and structures have causal powers. In Bhaskar's (1979) transformational model of society, people as agents create and reproduce structures which in turn enable or constrain the actions of agents. Unfortunately, a lot of quantitative research, especially that based on official statistics and aggregate analysis, uses poorly abstracted taxonomic collectives which classify according to formal similarities and not functional connections, thereby conflating very heterogeneous groups. Just think of the 'service industry': hairdressers and bankers may be placed in that group but they have very different powers to act. No methodology, however sophisticated, can rescue a poor abstraction and classification.

Practise retroduction but accept that it is always fallible: This is the key epistemological method of CR in open systems; it involves guessing what underlying causal mechanisms are operating. It is the form of reasoning used by detectives: here is the partial evidence – who could have caused the crime? Given a fallible hunch, what evidence do we need to corroborate it? In data analysis, we have to move from some observed phenomenon (or its absence) to posit some underlying mechanism via a trial explanation which we call a model. Our job is to identify and to make sense of demi-regs as evidence that relatively enduring, and potentially identifiable, mechanisms have been operating. Consequently, the technique of regression modelling (Jones, Chapter 27, in this volume) can be seen as an attempt to identify spontaneously occurring closures which may give us clues to processes (Ron, 2002).
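As a toy illustration of what identifying such a regularity can look like in practice, the following Python sketch, with an invented data-generating mechanism, fits an ordinary least-squares line to noisy data and recovers the underlying tendency:

```python
# Regression as detection of a demi-reg (simulated mechanism).
# The underlying tendency is y = 3 + 2x, buried in noise.
import random

random.seed(3)

xs = [random.uniform(0, 10) for _ in range(10_000)]
ys = [3 + 2 * x + random.gauss(0, 5) for x in xs]

# Ordinary least squares for one predictor, from the textbook formulas.
n = len(xs)
mx = sum(xs) / n
my = sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

print(f"fitted: y = {intercept:.2f} + {slope:.2f}x")  # close to y = 3 + 2x
```

The fitted slope is the demi-reg; why the tendency holds, for whom, and where it would break down are the retroductive questions that the regression cannot answer by itself.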


Demi-regs must be seen as the beginnings of causal explanation, not as the end point. There is a need to open up the black box, to understand outcome patterns rather than seek outcome regularities, and to be sensitive to the context in which the patterns have been found. Regularities, even enduring ones, do not explain themselves.

Go back and forth between theory and the empirical patterns suggested by the data: The more extreme positivist would regard this as cheating – you are supposed to come up with a theory (without looking at data) which is then confronted and either confirmed or falsified. That is simply impossible to implement in open systems, which require an iterative process of understanding the data and developing constructs, not a one-off calibration. In data analysis you need to try out models on the basis of a vague idea and discard them when they yield no explanatory return. This rarely succeeds on the first try, and various models are needed to establish a good demonstration. Any account of what happens in reality must be able to accommodate this active role of the scientist in the process. The running of many models should not be regarded as 'sinning in the basement' (Kennedy, 2002) which must be hidden away; indeed, it is both licensed and required by retroduction. At the same time there can be no pretence that the resultant estimates are the only ones that have been fitted, or that they represent universal regularities across time and space that perfectly conform to theoretical expectation.

Choose appropriate techniques for the task in hand: In the context of all this philosophically-inspired advice, it is easy to forget that we do need technique. One of the great all-time statisticians, John Tukey (1962), distinguished between exploration and confirmation; the former brings data into sharper focus to reveal patterns and anomalies; the latter is the use of significance testing to confirm well-developed hypotheses and avoid unreliable results. The specific technique matters too. If we take the arithmetic mean (Lewin, in this volume), Canadians earn $2,000 less than Americans, which may suggest that a policy of lower taxes and cutbacks is needed. But if you calculate what the average Canadian earns as the median, it is $2,000 higher than in America, because the US is a much more unequal society, with the arithmetic mean being pulled upwards by relatively few big earners – different techniques, different answers, different implications.
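A minimal Python illustration of that mean/median contrast, using simulated log-normal incomes rather than the real Canadian and US figures:

```python
# Mean versus median on skewed incomes (simulated, not real data).
import random
import statistics

random.seed(42)

# Log-normal incomes: most people modest, a few very large earners.
incomes = [random.lognormvariate(mu=10.5, sigma=0.8) for _ in range(100_000)]

mean = statistics.fmean(incomes)
median = statistics.median(incomes)

print(f"mean:   {mean:10.0f}")    # pulled up by the long right tail
print(f"median: {median:10.0f}")  # what the 'average' person earns
```

In any right-skewed distribution the mean sits well above the median, so the choice of 'average' can point policy in opposite directions.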


Play with your data and use the tools of exploratory data analysis: EDA is close in spirit to retroduction; indeed, Tukey saw it as numerical detective work, in contrast to the judgemental confirmatory approach where the hypothesis is on trial. EDA is an attitude which encourages and licenses an iterative approach. It is based on the notions that it is 'better a good answer to a vague question than a precise answer to the wrong one' and that 'by assuming less you learn more' (Jones and Almond, 1992). It has encouraged the development of procedures that reveal patterns in the data, and of diagnostic, often graphical, tools for exposing where assumptions are not met. In particular, residual analysis, examining the information that has not been accounted for by a model, has been likened to the magnifying glass of the story-book detective for bringing into sharper focus where the explanation is not working. However, we are not being data-driven, because we are exploring in the light of theoretically and substantively interesting questions as we investigate multiple working hypotheses.

Be alert for outliers and think about what may be causing them: Outliers, cases which are 'far away' from the rest, are often thrown up by data analysis. If they are not just mistakes, they offer an opportunity to learn that the current model is not working, and they may give clues to new mechanisms. These potentially explainable anomalies are known as contrastive demi-regularities. We may learn about the normal by considering the abnormal, treating the outlier as a critical case. In data analysis, residual, post-modelling, graphical analysis can be a powerful tool for their identification (Cox and Jones, 1981).

Appreciate the limited role of statistical hypothesis testing: The purpose of confirmatory approaches (Barnes and Lewin, in this volume) is to guard against chance results being interpreted as genuine pattern. Naive falsification – the testing of a null hypothesis – is unsupportable in open systems, as the absence of an effect may be due to some other process preventing it. A significant p value really tells you that your sample size was large enough to detect an effect. Moreover, statistical significance says nothing about the magnitude and importance of the effect, and cannot be viewed as a substitute for judgemental assessments of the theoretical and practical significance of a particular model.
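That point about p values and sample size can be demonstrated in a few lines of Python. In this sketch the data are simulated and the effect size is invented: the true group difference is fixed at a trivial 0.05 standard deviations throughout, and only the sample size changes.

```python
# The same trivially small effect: 'non-significant' in a small sample,
# 'highly significant' in a large one. Simulated data, known unit variances.
import math
import random

random.seed(7)

def two_sample_p(n):
    """Two-sided p value for a difference in two group means (z test)."""
    a = [random.gauss(0.00, 1) for _ in range(n)]
    b = [random.gauss(0.05, 1) for _ in range(n)]  # true effect: 0.05 sd
    z = (sum(b) / n - sum(a) / n) / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))        # = 2 * (1 - Phi(|z|))

for n in (100, 1_000, 100_000):
    print(f"n per group = {n:>7,}: p = {two_sample_p(n):.4f}")
```

The magnitude never changes; significance appears simply because n has grown.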

Do not indulge in data dredging: This activity tries to find 'rules' that link variables by maximizing the goodness-of-fit between the model and the observed data, usually by machine-based algorithms using automated significance-testing procedures. There is no contradiction between encouraging playing with data and denouncing dredging, for the former is retroduction and the latter is induction, seeking regularity as a black box without concern for illuminating causal mechanisms. Goodness-of-fit is a poor criterion for choosing one model over another; you can get perfect agreement by putting in as many variables as there are observations, but you would have explained nothing. Such an approach can be likened to the Texan sharpshooter who fires at the barn door, draws circles around where the bullets have clustered, and announces success. The explanations are being found solely in the results. Such unbridled empiricism is likely to capitalize on chance, finding pattern where none exists. Blind number-crunching can be dangerous; you may have identified a black-box rule for profitable sub-prime lending, but the market could change (breaking extrinsic closure), rendering the rule useless, with dire consequences.

Be very wary of predictions: Quantitative forecasts are based on equations capturing enduring relations of the past that are assumed to continue into the future. In open systems this will not be the case, and no predictive system is able to deal with abrupt breaks of regime. Sherden's (1998) provocatively titled book examined the success of 16 types of forecasts. Only one-day-ahead weather forecasts and the ageing of the population were more reliable than chance. A real danger arises when fallible predictions are treated normatively, that is, as goals that have to be achieved. This can degenerate into unthinking engineering to preserve the status quo; any systemic inequalities embedded in the equations are thereby reproduced into the future.

Use deduction but know its limitations: Deduction in QSS means using mathematics to represent the world so as to come up with potentially unexpected results. Thus, in the birthday problem, we can deduce from theory that we need only 23 people in a room to have a greater-than-evens chance that two people share the same birthday, a smaller number than most expect (see the sketch below). Exploiting the power of mathematics requires making ruthless abstraction, where the world is stripped back to just a few key terms. This can potentially bring new knowledge. For example, the spread of an epidemic can be reduced to the reproduction ratio of how many people a single person can infect, and this allows the effective and realistic assessment of alternative futures, even on a global scale (Colizza et al., 2007). Recent years have seen the development of complexity social science (Byrne, 1998), with its feedback and non-linear dynamic systems, but the realism of the assumptions in relation to human social action remains a key issue for any deductive reasoning.
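The birthday deduction is easily checked. This Python sketch is the standard exact calculation, assuming 365 equally likely birthdays:

```python
# Exact birthday problem: probability that at least two of n people
# share a birthday, assuming 365 equally likely birthdays.
def p_shared_birthday(n: int) -> float:
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (365 - k) / 365  # person k+1 misses all earlier ones
    return 1 - p_all_distinct

for n in (10, 22, 23, 30):
    print(f"n = {n:2d}: P(shared) = {p_shared_birthday(n):.3f}")
# n = 23 is the first group size where the probability exceeds 0.5
```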


Use experimentation in all its forms: At the outset of a study, undertake a thought experiment as if you had unlimited resources and there were no ethical constraints. This ideal experiment will often allow you to formulate precise causal questions, appreciate what you need to control to get isolation, and identify a strategy for doing this in practice. If a randomized trial is possible, we can intervene and randomly allocate some individuals to receive the potentially causal exposure. This ensures that on average the exposed and non-exposed will be 'balanced' on all possible other influences. This state of equipoise holds off any other causal mechanisms even if we do not know what they are! The desirability of the randomized experiment comes from this ability to deal with unmeasured confounders (the 'unknown unknowns') that could really be behind the apparent regularity between a causal exposure and an outcome. In recent years there has been an upsurge of interest in experimental social science. To take a single example, Oakley (1990), in her randomized study of social support for young mothers, revealed that the standard practice of midwives involved discriminatory stereotyping based on race and class. You may object that you cannot manipulate social dispositional variables like gender and race, but researchers have designed ingenious experiments to examine sexism and racism in which there is a consistent script but actors of different sex and race are used (Feldman et al., 1997).

If randomization is not possible, then look for quasi- or natural experiments: This is when the causal mechanism has naturally been turned off and on, or when there is a random-like process determining who gets exposed to the potentially causal process. An outstanding guide to identification strategies using naturally occurring randomization is Angrist and Pischke (2009). For this to work you need a variable known as an instrument, which strongly influences exposure to the causal process but does not affect the outcome directly. One of their examples is estimating the effect of family size on workplace participation and earnings. It is not good enough simply to relate these outcomes to family size, because reverse causality may be occurring (workplace outcomes affecting family size) and there may be powerful preferences affecting having children and working that have not been measured (the unknowns).


The study finds that two-child families with same-sex children are very much more likely to have another child than families whose two children are of different sexes. The sex of the first two children thus forms a natural experiment: it is as if an experimenter has randomly assigned some families to have two children and others to have three or more, once we take account of the sex mix (the instrument). The authors are then able to estimate the causal effect of having a third child, finding that the labour-market consequences are more likely to be severe for poor and less-educated women, while husbands experience little change. As always, we have to be careful in transferring results from the rather artificial world of the 'closed' system. Here, the effect of going from two to three children does not necessarily apply to going from zero to one child, or from one to two.

Use the appropriate observational design for the problem in hand: If an experimental design is not possible, choose the most efficient design that requires the least resources to get evidential data (Jones and Subramanian, 2000). Thus, if there is a rare outcome, choose a design that samples on the basis of the outcome: a case-control design. A classic example is the concern about a sudden upsurge in birth defects in Germany. The researchers studied 46 cases of limb-defect babies and 300 normal-birth comparisons. It was found that 41 of the cases had been exposed to the drug Thalidomide, compared with none of the controls: very strong evidence that this drug was the cause (see the sketch below). In contrast, if there is a rare causal process operating, choose a multi-sample cohort design based on those who are and who are not exposed to the process. This design was used to follow those who did and did not work at the Sellafield reprocessing plant, finding that the father's pre-conceptional exposure to irradiation increased the child's risk of leukaemia. All observational designs, however, face the problem of not being able to control for unknown confounders, as these are not measured. A chastening case arose when the best observational studies found that HRT led to a relative reduction of 50 per cent in coronary heart disease, but subsequent randomized trials found an increased risk of 30 per cent. Despite the best efforts of the observational researchers, those receiving the treatment were systematically different from those not, and considerable efforts at statistical analysis had not been sufficient to achieve equipoise.
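Returning to the Thalidomide illustration, the strength of case-control evidence is usually summarized as an odds ratio. In this Python sketch the counts are those given in the text; the +0.5 continuity correction for the zero cell (Haldane's correction) is a standard device and my addition, not part of the original study.

```python
# Case-control strength of evidence for the Thalidomide example.
# Counts from the text: 41/46 cases exposed, 0/300 controls exposed.
import math

cases_exposed, cases_total = 41, 46
controls_exposed, controls_total = 0, 300

# 2x2 table cells, with Haldane's +0.5 correction for the zero cell.
a = cases_exposed + 0.5
b = (cases_total - cases_exposed) + 0.5
c = controls_exposed + 0.5
d = (controls_total - controls_exposed) + 0.5

odds_ratio = (a * d) / (b * c)
log_se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of the log odds ratio

low = math.exp(math.log(odds_ratio) - 1.96 * log_se)
high = math.exp(math.log(odds_ratio) + 1.96 * log_se)
print(f"odds ratio ~ {odds_ratio:,.0f}")
print(f"95% CI: {low:,.0f} to {high:,.0f}")
```

Even after the correction, the interval sits orders of magnitude above 1, which is why so small a study could be decisive.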


Recognize that the need for parsimony depends on what you are doing: Parsimony is often interpreted as the simplest explanation being the correct one. In data analysis this is frequently taken to imply that the model should be kept as simple as possible, with few terms. For me, the usefulness of the criterion differs by practice. In induction, it can be useful for black-box model building: in the need to capture signal, not noise, parsimony becomes our guard against over-fitting and capturing contingent fluctuations. In deduction, simplicity is often necessary to make the mathematics tractable, but if there is poor abstraction and key processes are missing, then the model is not complex enough. In retroduction, while we might start with simple models for practicality, I see no reason why models should necessarily be limited to simple ones. In panel studies of people's changing behaviour over time, it is recommended that a separate term is put in the model for each and every person (Allison, 2009), so that each individual becomes their own 'control', just as if all stable unobserved variables had been measured, achieving the same function as random assignment in designed experiments. Such a model may well have thousands of terms. When the problem is complex, a complex model may well be needed. Parsimony should not be used to assert a naive view of causality, or to ignore demanding technical requirements.

Don't be afraid to explore for interactions: The CR account suggests that causal mechanisms may come together either to negate an effect or to act synergistically, with emergent power creating an enhanced effect. Such interplay of causal mechanisms should show up in data analysis as what are known as interactions. Their exploration is an important part of opening up the black box to see the causal pathways behind the demi-regs. Thus, there may be a main effect of smoking on bronchitis, so that those who smoke have a higher risk; there may also be elevated levels of the disease in those who live in higher air pollution; but the highest risk is for those who smoke and who live in high pollution (see the sketch below). Including interactions in the model is also a way of having different models for different subsets of people. Indeed, these moderating interactions may be the most interesting part of a study. To take one example, the Head Start programme has been applied to millions of children. It is informed by a 1962 randomized experiment involving 123 black preschoolers, 58 of whom were treated to intensive education and home visits, with 65 in the control group. They were followed until they were aged 27. A recent re-analysis (Anderson, 2008) found that the positive effects were driven by the results for girls; the intervention did little for boys. This example also shows that even randomized experiments do not analyse themselves.
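Here is a sketch of how such a synergistic interaction surfaces in a model, using simulated data with invented risks for the smoking-and-pollution illustration. A linear probability model is used purely so that the coefficients read directly as risks; a logistic model would be the more usual choice for a binary outcome.

```python
# Simulated main effects plus a synergistic interaction (invented numbers).
import numpy as np

rng = np.random.default_rng(0)
n = 50_000
smoke = rng.integers(0, 2, n)    # 0/1 smoker
pollute = rng.integers(0, 2, n)  # 0/1 lives in high pollution

# True risk of bronchitis: baseline 5%, +5% for smoking, +3% for pollution,
# and an extra +7% only when both exposures are present (the synergy).
p = 0.05 + 0.05 * smoke + 0.03 * pollute + 0.07 * smoke * pollute
y = rng.random(n) < p

# Linear probability model with an interaction term, fitted by least squares.
X = np.column_stack([np.ones(n), smoke, pollute, smoke * pollute])
coef, *_ = np.linalg.lstsq(X, y.astype(float), rcond=None)

names = ["baseline", "smoking", "pollution", "smoking x pollution"]
for name, b in zip(names, coef):
    print(f"{name:>20}: {b:+.3f}")
# The interaction coefficient recovers the ~0.07 surplus risk.
```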

Recognize that data in open systems are 'ficts': Data are seen in positivism as facts, the arbiters of theory. In CR, data obtained in open systems are seen as 'ficts' (Olsen and Morgan, 2005), which may not be true, mirror-like representations of reality, but are still useful for warranted arguments, as speculation and as sources for explanation. Our understanding and analysis of data are of necessity theory-laden, but that does not mean our concepts fully determine the measurement. Theories suggest where to look and what to measure, but do not determine what we find. Our best hope of approaching objectivity lies not only in the actions of individual researchers (through randomization and blinding to outcome and exposure), but also in the social processes of scrutiny and criticism of the broader scientific community.

Do not expect textbooks to provide a cookbook recipe for your study: There are several aspects to this. First, the majority of texts follow the positivist line – for exceptions, see the bibliography. Second, what we are researching should inform the choice of appropriate analysis. Sir R.A. Fisher was horrified that his statistical tests were used outside the setting – agricultural trials – for which he had developed them. Thus, the use of the F test for judging differences in the means of interventions is based on the independence of observations (Barnes and Lewin, in this volume). This assumption is guaranteed by randomization in trials, but is unlikely to hold for observational studies; the dependency requires a more sophisticated modelling approach (Jones, Chapter 27, in this volume). Third, a great deal of tacit knowledge is required in any specific application. Magnus and Morgan (1999) conducted an experiment in which an apprentice had to replicate the analysis that might have been carried out by three different experts following their published guidance. In all cases, the results were different from each other, and different from those subsequently produced by the experts! This undermines claims to researcher-independent objectivity and suggests that experience is required not only in the subtleties of the method but also in the characteristics of what is being studied.


Be sensitive to context: Causal processes can produce different results in different settings. Pawson and Tilley (1997) have developed a realism-inspired form of evaluation which involves identifying CMO configurations, where C is context, M is mechanism and O is outcomes. They argue that researchers should aim to identify the features of contexts that allow different mechanisms to be activated so as to generate particular outcomes. Theory is therefore used to derive the context that creates the ideal conditions for triggering the mechanism in question. The method seeks not generalization but specification: what works for whom in a given set of circumstances, and what is preventing change? As Deaton (2009) argues, success depends crucially not on the evaluation of specific interventions but on the evaluation of theoretical mechanisms. Quantitative meta-analysis, which pools information across contexts and averages the size of effect from different studies, needs to be treated with considerable caution, as Pawson (2006) cogently argues.

Apply criteria of judgement in evaluating a theory: CR recognizes that, in the social world at least, there is unlikely to be a definitive make-or-break study. Instead it encourages judgemental rationalism: there is fallibility (we can never prove a theory to be true for all time) but also the possibility of objective knowledge (not all theories are equally valid). This objectivity is possible because, in the intransitive dimension, reality exists independently of us, and knowledge can be more or less like this reality. The process of obtaining objectivity in the transitive dimension is through competitive between-theory cross-validation, a process of focused disputation between researchers. Explanation is itself therefore a social process whereby organized distrust produces trustworthy results (Campbell, 1984). Rational grounds for preferring one theory over another are its explanatory power, comprehensiveness, degree of supporting evidence, and coherence with other bodies of knowledge. Such criteria of judgement have a long history in observational research (Jones and Moon, 1987) and were responsible for regarding cigarette smoking as a cause of cancer even without the possibility of experimenting to produce a closed system. It is definitely not a matter of goodness-of-fit between the observed and predicted data.

Conclusions

I have argued that for science to be so obviously successful there must be an independent external world of which we can gain reliable, if not provable, knowledge. There is a need for science to reveal deeper structures and mechanisms to counter irrationality, prejudice, superstition and 'bad science' (Goldacre, 2008); taken-for-granted common sense may be false knowledge. Quantitative analysis can play a part in this by aiding the collection of reliable evidence, dealing with uncertainty, using analytical techniques to identify patterns and anomalies, and setting out a logical framework for making causal inferences. The aim of emancipatory social science is not to identify and reproduce universal regularities, but to recognize and change them. I have concentrated here on causality because of its centrality to science, but in reality much quantitative work is 'social mapping', and this counting and estimation of prevalence and change is important too for knowing what is happening in the world. At its best, quantitative work is a rich, knowledgeable and reflective practice, far removed from its positivist caricature.

Notes
1. A book by Richard J. Herrnstein and Charles Murray published in 1994: The Bell Curve: Intelligence and Class Structure in American Life. It is widely regarded as controversial.
2. Pawson (2006: 19–20) provides a brief history.
3. The epistemic fallacy reduces the three domains to the single domain of the observable events of empiricism (Bhaskar, 1978: 36).

Annotated bibliography

Angrist, J.D. and Pischke, J-S. (2009) Mostly Harmless Econometrics: An Empiricist's Companion. Princeton: Princeton University Press.

A lucid account of what QSS is really about (that is, not just a bag of techniques), and a state-of-the-art account of how to do causal empirical research.

Brady, H.E. and Collier, D. (eds) (2004) Rethinking Social Inquiry: Diverse Tools and Shared Standards. Lanham: Rowman and Littlefield.
Something of a riposte, which argues that standards must be set from exemplary qualitative work as well as from quantitative studies.

Johnston, R.J. (1986) On Human Geography. Oxford: Basil Blackwell.
Integrates realist social theory and quantification in the study of places.


King, G., Keohane, R.O. and Verba, S. (1994) Designing Social Inquiry. Princeton: Princeton University Press.
Another lucid account of what QSS is really about; it caused quite a stir by laying out guidelines for conducting qualitative research from a quantitative viewpoint.

Lopez, J. and Potter, G. (eds) (2001) After Postmodernism: An Introduction to Critical Realism. London: Athlone Press.
Provocatively titled, this is a wide-ranging collection on the attractions of CR.

Marshall, G. (1997) Repositioning Class: Social Inequality in Industrial Societies. London: Sage.
A feisty account of why we are not a postmodern society requiring postmodern methods.

Oakley, A. (2000) Experiments in Knowing: Gender and Method in the Social Sciences. Cambridge: Polity Press.
A highly personal reflection on the need for experiments in social science by a noted feminist author.

Olsen, W. (2010) Realist Methodology, 4 volumes. London: Sage.
This compendious collection covers the methodological implications of realism.

Pawson, R. and Tilley, N. (1997) Realistic Evaluation. London: Sage; and Pawson, R. (2006) Evidence-Based Policy: A Realist Perspective. London: Sage.
To see how critical realism can be put into practice in policy research.

Yu, C.H. (2006) Philosophical Foundations of Quantitative Research Methodology. Lanham: University Press of America.
An extended critique of how quantification does not equate to positivism.

Further references

Abelson, R.P. (1995) Statistics as Principled Argument. New Jersey: Lawrence Erlbaum.
Allen, J. (1983) 'Property relations and landlordism – a realist approach', Environment and Planning D, 1: 191–203.
Allison, P.D. (2009) Fixed Effects Regression Models. Thousand Oaks, CA: Sage.
Anderson, M. (2008) 'Multiple inference and gender differences in the effects of early intervention', Journal of the American Statistical Association, 103: 1481–95.
Bhaskar, R. (1975) A Realist Theory of Science. Hassocks: Harvester Press.
Bhaskar, R. (1978) A Realist Theory of Science, 2nd edn. Brighton: Harvester Press.
Bhaskar, R. (1979) The Possibility of Naturalism. Brighton: Harvester Press.
Byrne, D. (1998) Complexity Theory and the Social Sciences. London: Routledge.
Campbell, D.T. (1984) 'Can we be scientific in applied social science?', in R.F. Conner, D.G. Altman and C. Jackson (eds) Evaluation Studies Review Annual. Thousand Oaks, CA: Sage. pp. 26–48.
Colizza, V., Barrat, A., Barthelemy, M., Valleron, A-J. and Vespignani, A. (2007) 'Modeling the worldwide spread of pandemic influenza', PLoS Medicine, 4(1): e13.
Cox, N.J. and Jones, K. (1981) 'Exploratory data analysis', in N. Wrigley and R.J. Bennett (eds) Quantitative Geography. London: Routledge. pp. 135–43.
Deaton, A.S. (2009) 'Instruments of development: Randomization in the tropics, and the search for the elusive keys to economic development', Proceedings of the British Academy, 162: 123–60.
Feldman, H.A., McKinlay, J.B., Potter, D.A., Freund, K.M., Burns, R.B., Moskowitz, M.A. and Kasten, L.E. (1997) 'Nonmedical influences on medical decision making: An experimental technique using videotapes, factorial design, and survey sampling', Health Services Research, 32: 343–66.
Goldacre, B. (2008) Bad Science. London: Fourth Estate.
Harré, R. and Madden, E.H. (1975) Causal Powers. Oxford: Blackwell.
Jones, K. and Almond, S. (1992) 'Moving out of the linear rut: The possibilities of generalised additive models', Transactions of the Institute of British Geographers, 17: 434–47.
Jones, K. and Moon, G. (1987) Health, Disease, and Society. London: Routledge.
Jones, K. and Subramanian, S.V. (2000) 'Observational studies and design choices', in G.M. Moon, M. Gould and colleagues (eds) Epidemiology. Buckingham: Open University Press. pp. 70–85.
Kennedy, P.E. (2002) 'Sinning in the basement: What are the rules?', Journal of Economic Surveys, 16: 569–89.
Lawson, T. (1989) 'On abstraction, tendencies and stylised facts: A realist approach to economic analysis', Cambridge Journal of Economics, 13: 59–78.
Levitt, S.D. and Dubner, S.J. (2005) Freakonomics: A Rogue Economist Explores the Hidden Side of Everything. London: Penguin Books.


Magnus, J.R. and Morgan, M.S. (1999) Methodology and Tacit Knowledge. New York: John Wiley.
Morgan, S.L. and Winship, C. (2007) Counterfactuals and Causal Inference. Cambridge: Cambridge University Press.
Oakley, A. (1990) 'Who's afraid of the randomised controlled trial?', Women and Health, 15: 25–59.
Olsen, W.K. and Morgan, J. (2005) 'A critical epistemology of analytical statistics: Addressing the sceptical realist', Journal for the Theory of Social Behaviour, 35: 255–84.
Pearl, J. (2009) Causality, 2nd edn. Cambridge: Cambridge University Press.


Ron, A. (2002) ‘Regression analysis and the philosophy of social sciences – a critical realist view’, Journal of Critical Realism, 1: 115–36. Russell, J. and Glauberman, N. (1995) The Bell Curve Debate: History Documents, Opinions. New York: Random House. Sayer, A. (1992) Method in Social Science. London: Routledge. Sherden, W. (1998) The Fortune Sellers. New York: Wiley. Tukey, J.W. (1962) ‘The future of data analysis’, Annals of Mathematical Statistics, 33: 1–67.

211