Qual Quant (2010) 44:881–892 DOI 10.1007/s11135-009-9241-z
Generalization practices in qualitative research: a mixed methods case study
Anthony J. Onwuegbuzie · Nancy L. Leech
Published online: 9 May 2009 © Springer Science+Business Media B.V. 2009
A. J. Onwuegbuzie (corresponding author), Department of Educational Leadership and Counseling, Sam Houston State University, Huntsville, TX 77341-2119, USA; e-mail: [email protected]
N. L. Leech, University of Colorado Denver, Denver, CO, USA

Abstract  The purpose of this mixed methods case study was to examine the generalization practices in qualitative research published in a reputable qualitative journal. To accomplish this, all articles published in The Qualitative Report since its inception in 1990 (n = 273) were examined. A quantitative analysis of all 125 empirical qualitative research articles revealed that a notable proportion of studies (i.e., 29.6%) contained generalizations beyond the underlying sample that were made inappropriately by the author(s). A qualitative analysis identified the types of over-generalizations that occurred, which included making general recommendations for future practice and providing general policy implications based on only a few cases. Thus, a substantial proportion of articles published in The Qualitative Report lack what we call interpretive consistency.

Keywords  Qualitative research · Generalization · Statistical generalization · Analytic generalization · Sampling · Sample size · Interpretive consistency · Mixed methods · Sequential mixed methods analysis · Case study

1 Generalization practices in quantitative research

According to The American Heritage College Dictionary (1993), a generalization is a “principle, a statement, or an idea having general application” (p. 567). Consequently, “generalization” is a word that has everyday usage. Yet, when used in the field of behavioral and social science research, the term is extremely contentious. Among quantitative researchers, there appears to be unequivocal agreement that the goal of quantitative research typically is to generalize findings and inferences from a representative statistical sample to the population from which the sample was drawn.
This type of generalization is known as statistical generalization. Yet, in order to obtain representative statistical samples, random sampling techniques (i.e., simple random sampling, stratified random sampling, cluster random sampling, systematic random sampling) should be used; that is, samples in quantitative research ideally should be randomly selected. Unfortunately, the majority of quantitative research studies in the social and behavioral sciences do not utilize random samples (Shaver and Norton 1980a,b), even though “inferential statistics is based on the assumption of random sampling from populations” (Glass and Hopkins 1984, p. 177). In fact, given the rampant use of nonrandom samples, it is perhaps not surprising that the majority of quantitative data in the social and behavioral sciences are not normally distributed (Micceri 1989), even though normality is an important assumption underlying most null hypothesis significance tests. Second, the majority of quantitative studies involve sample sizes that are too small (i.e., that yield low statistical power) to detect relationships or differences that really exist. Conducting null hypothesis significance tests with low statistical power not only adversely affects internal validity (i.e., the integrity of findings), it also reduces external validity; that is, low statistical power limits generalizability (Onwuegbuzie 2003a). Third, because it is rare for everyone in the target population (i.e., the larger population to whom the results are to be generalized) to be available to participate in quantitative studies, quantitative researchers typically are compelled to select their samples from the accessible population rather than from the target population. The accessible population is the group of participants that is available to the researcher for participation in the study (Johnson and Christensen 2004). Although the goal in quantitative research is to generalize from the sample to the target population, even when a large and random sample is selected, researchers usually are justified, at best, in generalizing their study findings only to the accessible population. Generalizing from the sample to the target population is justified only if the accessible population is representative of the target population, which is rarely the case (Johnson and Christensen 2004). Indeed, as noted by Johnson and Christensen (2004), “generalizing the results of a study to the target population is frequently a tenuous process because the sample of participants used in most studies are not randomly selected from the target population” (p. 244). Thus, the lack of random sampling, combined with the use of inadequate sample sizes, seriously calls into question the extent to which findings and interpretations can be generalized justifiably from the sample to the underlying population. In other words, quantitative research studies typically are characterized by findings with low external validity. Yet, unfortunately, many quantitative researchers make generalizations to the target population in a rote, mechanical manner, without reflecting carefully on the extent to which their sample is statistically representative.
2 Generalization practices in qualitative research

As contentious as the concept of generalization is in quantitative research, it is even more controversial in qualitative research. A few researchers believe that approximate generalizations can be made in qualitative studies. Stake (1997) coined the phrase naturalistic generalization to refer to the process of making generalizations based on similarity (e.g., similarity of people, settings, times, and contexts). Yin (2003) also entertains the possibility of generalizations being made in qualitative research. He refers to such generalizations as representing replication logic, which is similar to the logic used in experimental studies. However, the majority of qualitative researchers agree that the goal of interpretivist research is not to make statistical generalizations. Rather, they argue that the goal of qualitative
research is to obtain insights into particular educational, social, and familial processes and practices that exist within a specific location and context (Connolly 1998). In other words, they claim that qualitative researchers study phenomena in their natural settings and attempt to make sense of, or to interpret, them with respect to the meanings people bring to them (Denzin and Lincoln 2000).

Although not readily acknowledged by some qualitative researchers, many qualitative studies, if not most, involve making one of two types of generalizations: analytic generalizations or case-to-case transfer (Curtis et al. 2000; Firestone 1993; Kennedy 1979; Miles and Huberman 1994). Analytic generalizations are “applied to wider theory on the basis of how selected cases ‘fit’ with general constructs” (Curtis et al. 2000, p. 1002). Consequently, they are similar to what Maxwell (1996) calls internal generalization, which represents generalizations within the setting or group studied. Case-to-case transfer involves making generalizations from one case to another (similar) case (Firestone 1993; Kennedy 1979). These two types of generalizations (i.e., analytic generalizations and case-to-case transfers) are very common in qualitative research, with analytic generalizations being the more popular. Moreover, as noted by Onwuegbuzie (2003b), qualitative researchers “generalize words and observations… to the population of words/observations (i.e., the truth space) representing the underlying context” (p. 400). Similarly, Williamson Shafer and Serlin (2005) stated the following:

The observations in any qualitative study are necessarily a subset of all other things that might have been observed using a particular set of tools and techniques in a particular setting. From this subset of all possible observations, a further subset is extracted to form the basis of qualitative inferences, since no qualitative analysis accounts for all of the observational data in equal measure. (p. 20)

Despite the claims by many that the goal of qualitative research is not to generalize beyond a sample to some underlying population, some qualitative researchers find it difficult to resist the temptation to generalize their findings (e.g., thematic representations) to some population (Onwuegbuzie and Daniel 2003). As noted by Onwuegbuzie and Leech (2007),

Such practices are flawed unless a representative sample has been selected. Whenever a theory is being developed, some type of generalization clearly has taken place. If generalization is not the goal, then he/she should only outline a theory in terms of the particular participant(s), setting, context, location, time, event, incident, activity, experience, and/or processes, as well as with respect to the specific researcher (assuming that the researcher is serving as the instrument). (p. 115)

However, it is not known how prevalent it is for qualitative researchers to make statistical generalizations of their findings, whether deliberately or inadvertently, beyond their study participants. In fact, to date, no researcher appears to have investigated the extent to which statistical generalizations are made in qualitative studies. Thus, the purpose of the present paper is to examine generalization practices in qualitative research published in a qualitative journal.
It was hoped that findings from the present investigation would provide evidence regarding interpretive consistency in qualitative research, wherein interpretive consistency represents the consistency between the interpretations made by the researcher and the procedures/design of the study (e.g., purpose of study, research question, sample size, sampling scheme, research design, data analysis techniques).
3 Method

3.1 Sampling scheme

The present study was mixed methods in nature because it utilized both quantitative and qualitative techniques to analyze the types of generalizations made in qualitative research articles. Onwuegbuzie and Collins (2007) have provided a useful framework for helping mixed methods researchers identify an optimal sampling design. This model provides a typology in which mixed methods sampling designs can be classified according to (a) the time orientation of the phases (i.e., whether the qualitative and quantitative components occur simultaneously or sequentially) and (b) the relationship between the qualitative and quantitative samples (i.e., identical vs. parallel vs. nested vs. multilevel). An identical relationship implies that exactly the same sample members are involved in both the qualitative and quantitative phases of the study. A parallel relationship indicates that the samples for the qualitative and quantitative components of the research are different but are drawn from the same population of interest. A nested relationship denotes that the sample members selected for one phase of the study represent a subset of those selected for the other component of the research. Finally, a multilevel relationship involves the use of two or more sets of samples that are extracted from different levels of the study. Crossing the two criteria, time orientation (two options) and sample relationship (four options), yields eight major sampling designs that mixed methods researchers have at their disposal. In the current study, we utilized a Sequential Design using Identical Samples; in this sampling design, the quantitative phase informed the qualitative phase.

The data for this study were obtained from the journal The Qualitative Report (TQR), which was established in 1990. This journal was selected (i.e., criterion purposive sampling) because it is a peer-reviewed journal with a tradition of publishing qualitative studies (The Qualitative Report 2005). At the time of writing, TQR had published 11 volumes comprising 36 issues, the most recent being Volume 11, Issue 1. The number of articles per issue has ranged from 5 to 10 (M = 7.58, SD = 2.12), yielding a total of 273 articles published in 16 years. Table 1 contains the number of articles contained in the 36 issues. Consistent with our study purpose, only applied (i.e., empirical) articles were analyzed; that is, monographs, book reviews, essays, editorials, and methodological articles were excluded from the analysis. This led to the examination of 125 empirical research articles, which represented 45.8% of the total number of articles published in the journal's history.
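The mean and standard deviation reported above for articles per issue can be reproduced from the total-articles column of Table 1. The following minimal Python sketch is illustrative only (it is not part of the original study's procedures) and simply hard-codes the 36 per-issue totals:

```python
# Illustrative check of the descriptive statistics for articles per issue,
# using the per-issue totals from Table 1 (36 issues, Volume 1(1) through 11(1)).
from statistics import mean, stdev

articles_per_issue = [
    10,                      # 11(1)
    10, 10, 10, 10,          # 10(4)-10(1)
    10, 10, 10, 10,          # 9(4)-9(1)
    10, 10, 10, 9,           # 8(4)-8(1)
    8, 8, 7, 5,              # 7(4)-7(1)
    5, 5, 5, 5,              # 6(4)-6(1)
    6, 9,                    # 5(3/4), 5(1/2)
    8, 8,                    # 4(3/4), 4(1/2)
    5, 6, 5, 8,              # 3(4)-3(1)
    5, 5, 5, 8,              # 2(4)-2(1)
    6, 5, 7,                 # 1(4), 1(2/3), 1(1)
]

assert len(articles_per_issue) == 36 and sum(articles_per_issue) == 273
print(f"M = {mean(articles_per_issue):.2f}, SD = {stdev(articles_per_issue):.2f}")
# -> M = 7.58, SD = 2.12
```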
3.2 Research design

Using Leech and Onwuegbuzie's (2005) framework, the mixed methods research design represented a fully mixed sequential dominant status design. The design was fully mixed in the current investigation because qualitative and quantitative research approaches were mixed within and across several stages (i.e., research formulation, research planning, and research implementation) of the research process. In this design, the quantitative and qualitative phases occurred sequentially across the stages. Finally, the qualitative phase was given more weight. For the quantitative research phase, a descriptive research design was used (Johnson and Christensen 2004). The qualitative phase involved a case study; specifically, a two-level case study was utilized. The first level, involving the choice of the journal (i.e., TQR), represented an instrumental case design (Stake 2005). According to Stake (2005), in instrumental case designs a particular case is examined primarily to provide insights into an issue, such that the case itself is of secondary interest and plays a supportive role.
Table 1  Types of articles published in The Qualitative Report (1990–2006)

Volume #     Total no.     No. of            No. of book/      No. of    No. of        No. of
(series #)   of articles   methodological    journal reviews   essays    editorials    empirical
                           articles                                                    studies

11(4)        –             –                 –                 –         –             –
11(3)        –             –                 –                 –         –             –
11(2)        –             –                 –                 –         –             –
11(1)        10            2                 0                 0         0             8
10(4)        10            5                 0                 0         0             5
10(3)        10            4                 0                 0         0             6
10(2)        10            5                 0                 1         0             4
10(1)        10            3                 0                 0         0             7
9(4)         10            3                 0                 0         0             7
9(3)         10            2                 0                 0         0             8
9(2)         10            3                 0                 0         0             7
9(1)         10            5                 0                 0         0             5
8(4)         10            4                 0                 0         0             6
8(3)         10            2                 0                 0         0             8
8(2)         10            6                 0                 1         0             3
8(1)         9             7                 0                 0         0             2
7(4)         8             0                 0                 0         0             2
7(3)         8             3                 0                 0         0             3
7(2)         7             3                 0                 0         0             4
7(1)         5             4                 0                 0         0             1
6(4)         5             1                 0                 0         0             4
6(3)         5             1                 0                 0         0             4
6(2)         5             3                 0                 0         0             2
6(1)         5             2                 0                 0         0             3
5(3/4)       6             3                 0                 0         1             2
5(1/2)       9             6                 1                 0         0             2
4(3/4)       8             6                 0                 0         0             2
4(1/2)       8             3                 0                 0         0             5
3(4)         5             3                 0                 0         0             2
3(3)         6             4                 1                 0         0             1
3(2)         5             4                 0                 0         0             1
3(1)         8             6                 0                 2         0             0
2(4)         5             4                 0                 0         0             1
2(3)         5             5                 0                 0         0             0
2(2)         5             5                 0                 0         0             0
2(1)         8             6                 1                 1         0             0
1(4)         6             2                 0                 1         2             1
1(2/3)       5             3                 0                 1         1             0
1(1)         7             2                 3                 1         1             0

Total        273           129               6                 8         5             125
In this investigation, it was expected that the case would facilitate understanding of the generalization practices of some qualitative researchers. The second level, involving the selected empirical articles (i.e., n = 125), represented a multiple or collective case design, which is an instrumental case design extended to several cases (Stake 2005) and which facilitates cross-case analyses.

3.3 Analysis

A sequential mixed methods analysis (SMMA; Onwuegbuzie and Teddlie 2003; Tashakkori and Teddlie 1998) was undertaken to analyze the empirical qualitative research articles. This analysis utilized qualitative and quantitative data analytic techniques in a sequential manner, commencing with quantitative analyses, followed by qualitative analyses that built upon the quantitative analyses. Using Greene et al.'s (1989) framework, the purpose of the mixed methods analysis was development, whereby the results from one data-analytic method informed the use of the other method.

The SMMA consisted of two stages. In the first stage (the quantitative stage), all 125 empirical research articles were classified according to the type of generalization made (i.e., statistical generalization vs. analytic generalization vs. case-to-case transfer vs. no generalization). The frequency rates of each type of generalization then were computed. Double coding (Miles and Huberman 1994) was used for categorization verification, which took the form of inter-rater reliability. Consequently, the verification component of categorization was empirical (Constas 1992). Specifically, the two researchers independently classified the same articles and then compared their codings. Cohen's kappa index, which measures the extent to which observers achieve agreement beyond the agreement that can be expected to occur by chance alone (Cohen 1960), was used to assess the inter-rater reliability of the two sets of classifications. Because a quantitative technique (i.e., inter-rater reliability) was utilized as a validation technique, in addition to being empirical, the verification component of categorization was technical (Constas 1992). The verification approach was undertaken a posteriori (Constas 1992). The following criteria were used to interpret the kappa coefficient:
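For reference, Cohen's (1960) kappa is computed as kappa = (P_o - P_e) / (1 - P_e), where P_o is the proportion of articles on which the two coders agree and P_e is the proportion of agreement expected by chance given the coders' marginal classification frequencies. The following minimal Python sketch is illustrative only (it is not the authors' code, and the article classifications shown are hypothetical):

```python
# Minimal illustration of Cohen's (1960) kappa for two coders who each assign
# one generalization category to every article.
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Return Cohen's kappa for two equal-length lists of category labels."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n   # P_o
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)  # P_e
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical classifications of six articles into the four categories used
# in the first (quantitative) stage of the SMMA.
coder_1 = ["statistical", "analytic", "none", "analytic", "case-to-case", "none"]
coder_2 = ["statistical", "analytic", "none", "statistical", "case-to-case", "none"]
print(round(cohens_kappa(coder_1, coder_2), 2))  # -> 0.78
```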