Juror Need for Cognition and Sensitivity to Methodological Flaws in Expert Evidence1

Bradley D. McAuliff2
California State University, Northridge

Margaret Bull Kovera
John Jay College of Criminal Justice, City University of New York

This study examined whether need for cognition (NC) moderated jurors’ sensitivity to methodological flaws in expert evidence. Jurors read a sexual harassment trial summary in which the plaintiff’s expert presented a study that varied in ecological validity, general acceptance, and internal validity. High NC jurors found the defendant liable more often and evaluated expert evidence quality more favorably when the expert’s study was internally valid vs. missing a control group; low NC jurors did not. Ecological validity and general acceptance did not affect jurors’ judgments. Ratings of expert and plaintiff credibility, plaintiff trustworthiness, and expert evidence quality were positively correlated with verdict. Theoretical implications for the scientific reasoning literature and practical implications for trials containing psychological science are discussed.

Now more than ever before, citizens serving as jurors in civil or criminal trials are likely to confront complex scientific evidence when deciding cases. Nearly two thirds (65%) of state court judges responding to a national survey indicated that they had encountered DNA evidence in their courtrooms (Gatowski et al., 2001). The presence of psychological science in court has become increasingly routine as well. Social or behavioral scientists constituted nearly one quarter of all scientists testifying in U.S. criminal appellate cases involving expert testimony from 1988 to 1998 (Groscup, Penrod, Studebaker, Huss, & O'Neil, 2002).

1 This research was conducted in partial fulfillment of the requirements for the first author's doctoral degree at Florida International University under the direction of the second author. The paper was the recipient of the first-place 2000 American Psychology–Law Society (APLS) Doctoral Dissertation Award. Portions of the research were presented at the joint meeting of the APLS and the European Association for Psychology and Law, Dublin, Ireland, July 1999. The research was supported by a grant from the National Science Foundation (SBE #9711225) to the second author. The authors thank the jury administration staff and personnel at Broward County Courthouse, 17th Judicial District, Fort Lauderdale, FL (Honorable Robert L. Andrews, Dolly Gibson, Norman Houghtaling, Pat Todaro, Lisa Muggeo, Audrey Edwards, Marvin Edelstein, and Allison Mitchel) for granting us access to the jury pool. We also thank dissertation committee members Brian Cutler and Ronald Fisher for their helpful suggestions.

2 Correspondence concerning this article should be addressed to Bradley D. McAuliff, Department of Psychology, California State University, Northridge, 18111 Nordhoff Street, Northridge, CA 91330-8255. E-mail: [email protected]

Journal of Applied Social Psychology, 2008, 38, 2, pp. 385–408. © 2008 Copyright the Authors. Journal compilation © 2008 Blackwell Publishing, Inc.

Despite the increasingly prominent role of scientific evidence in trials, little is known regarding how jurors reason about complex expert testimony involving statistical, probabilistic, and experimental findings. Concern about jurors' ability to reason about scientific evidence has increased in the wake of three Supreme Court rulings on the admissibility of expert evidence (Daubert v. Merrell Dow Pharmaceuticals, Inc., 1993; General Electric Co. v. Joiner, 1997; Kumho Tire Co. v. Carmichael, 1999). Prior to these rulings, scientific methodologies and conclusions that were generally accepted within the relevant scientific community were admissible at trial (Frye v. United States, 1923). However, the Daubert decision emphasized judges' gatekeeping role and charged them with the more difficult task of evaluating the reliability and validity of scientific research.

The Supreme Court's faith in judges' ability to be sophisticated consumers of science may have been misplaced. Judges often lack the scientific literacy required for a Daubert analysis (Gatowski et al., 2001) and have difficulty identifying methodologically flawed expert testimony (Kovera & McAuliff, 2000). Despite these limitations, judges overwhelmingly support their role as gatekeepers and appear confident in their gatekeeping abilities (Gatowski et al., 2001; Shuman, Whitaker, & Champagne, 1994). Because judges may admit methodologically flawed science at trial, it is important to determine whether jurors are sensitive to variations in the validity of expert scientific evidence. In other words, can jurors identify junk science, even if judges cannot?

Scientific Reasoning Ability

Laypeople generally have difficulty reasoning about scientific and methodological issues. When judging the probability of certain outcomes, people prefer anecdotal information and underutilize base-rate information (Bar-Hillel, 1980; Kahneman & Tversky, 1973; Simonson & Nye, 1992; Stanovich & West, 1998). Laypeople often fail to recognize the unreliability of results obtained through the use of small samples (Fong, Krantz, & Nisbett, 1986; Kahneman & Tversky, 1972) and are insensitive to sample bias when generalizing results to the population, even when the atypical nature of the sample is highlighted (Hamill, Wilson, & Nisbett, 1980). Without specialized tutoring, people have failed to recognize missing comparative or control-group information when evaluating certain scientific claims, such as the relation between fluoride use and tooth decay (Gray & Mill, 1990; Mill, Gray, & Mandel, 1994). Finally, laypeople vary considerably in their ability to apply various statistical reasoning skills (e.g., law of large numbers,
regression, base-rate principles) when confronting a series of inductive reasoning problems (Jepson, Krantz, & Nisbett, 1983).

Laypeople may be unable to reason about scientific issues in legal contexts as well. Mock jurors underutilize expert probabilistic testimony, compared to Bayesian norms (Faigman & Baglioni, 1988; Kaye & Koehler, 1991; Schklar & Diamond, 1999; Thompson & Schumann, 1987), and are reluctant to base verdicts on statistical evidence alone (Niedermeier, Kerr, & Messé, 1999; Wells, 1992). Mock jurors also have difficulty comprehending expert testimony on statistical matters (Faigman & Baglioni, 1988). Only 14% of mock jurors in that study correctly answered two questions designed to assess their understanding of a statistical expert's testimony, and 43% provided incorrect answers to both questions.

Extrapolating from this body of research, it seems reasonable to predict that jurors will have difficulty reasoning about psychological science in a relatively sophisticated manner. In short, they may be unable to differentiate valid research from junk science. This possibility raises an intriguing question: What other characteristics of expert evidence influence jurors' evaluations and judgments? Information-processing models from the social-cognitive psychological literature on persuasion provide a much needed theoretical framework to predict how jurors make decisions when confronting scientific evidence.

Dual-Process Models of Persuasion and Juror Decision Making

Two information-processing models are useful for understanding how jurors evaluate information presented at trial: the heuristic-systematic model (HSM; Chaiken, 1980; Chaiken, Liberman, & Eagly, 1989) and the elaboration likelihood model (ELM; Petty & Cacioppo, 1986). According to those models, individuals evaluate persuasive messages using two cognitive processes. Systematic (i.e., HSM) or central (i.e., ELM) processing is characterized by a high level of cognitive effort and entails careful scrutiny of the persuasive message's content. When engaging in systematic processing, people evaluate the quality of the arguments presented in the persuasive message. Systematic processors are more likely to adopt the position advocated in the persuasive message when it contains valid, high-quality arguments than when it does not (Petty, Cacioppo, & Goldman, 1981).

Individuals who engage in heuristic or peripheral processing do not scrutinize the quality of the persuasive arguments. Instead, they rely on mental shortcuts or decision rules when evaluating a persuasive message. Certain cues associated with the persuasive message (e.g., length or number of arguments; Petty & Cacioppo, 1984), its source (e.g., expertise, likability, physical
attractiveness; Chaiken & Maheswaran, 1994), and the audience (e.g., positive or negative audience reactions; Axsom, Yates, & Chaiken, 1987) may affect message evaluation in heuristic processing.

What determines whether someone is likely to engage in systematic or heuristic processing? Both the HSM and ELM propose that people generally are motivated to hold correct attitudes. Two factors that moderate the extent to which people process a message systematically are ability and motivation. To engage in systematic processing, an individual must be both able and motivated to do so. If someone lacks ability or motivation, that individual is likely to rely on heuristic processing when making message-related judgments.

Systematic Processing of Psychological Science

One variable that could affect jurors' motivation to reason about psychological science is the need for cognition (NC). Certain stable individual differences exist in the degree to which people engage in and enjoy effortful cognitive endeavors (Cacioppo & Petty, 1982). High-NC individuals naturally tend to seek, acquire, and think about information to understand the world around them. Low-NC individuals tend to rely on other methods of acquiring information that are less cognitively taxing (e.g., adopting the opinions of others, using cognitive heuristics, engaging in social comparison processes; Cacioppo, Petty, Feinstein, & Jarvis, 1996).

NC is also relevant to the systematic processing and elaboration of persuasive messages. High-NC individuals are more likely to think about and elaborate cognitively on issue-relevant information than low-NC individuals (Cacioppo, Petty, Kao, & Rodriguez, 1986). Meta-analytic results examining the NC × Argument Quality interaction in 11 studies revealed that the argument quality of persuasive messages exerted a greater influence on the attitudes of individuals high versus low in NC (Cohen's d = .31). In 5 of the 11 studies, researchers had asked participants to directly evaluate the quality of the persuasive message, and the combined results of those studies indicated that argument quality had a greater effect on high- versus low-NC individuals' ratings (d = .54) as well.

Previous research has shown that NC can affect jurors' judgments and reactions to expert evidence. Bornstein (2004) varied the presence of experimental and anecdotal expert testimony in a simulated personal injury case. Low-NC jurors believed that the defendant was less likely to have caused the plaintiff's injuries when a defense expert presented anecdotal evidence than when he did not. However, no such differences emerged for high-NC jurors.
Leippe, Eisenstadt, Rauch, and Seib (2004) examined how NC, case strength, and eyewitness expert evidence influenced mock jurors' verdicts in a simulated murder case. Juror NC interacted with case strength such that moderate-NC jurors convicted more often than did high- and low-NC jurors when the case against the defendant was strong. Consistent with these research findings, we expected that NC would moderate jurors' sensitivity to the validity of psychological science presented at trial in our study.

Heuristic Processing of Psychological Science

Jurors who are unable or unmotivated to systematically reason about psychological science may instead rely on heuristics when evaluating research quality. Heuristics involving source-related cues (e.g., an expert's credentials or trustworthiness) influence jurors' decisions when confronting highly complex expert testimony (Cooper, Bennett, & Sukel, 1996; Cooper & Neuhaus, 2000). Perhaps message-related cues influence jurors' decisions under conditions that promote heuristic processing as well.

One message-related cue that jurors might use when evaluating an expert's research involves the representativeness of the research. When people find it difficult to make a decision, they often rely on a representativeness heuristic to simplify the decision-making task (Tversky & Kahneman, 1974). If relying on this heuristic, jurors might evaluate the validity or usefulness of psychological science based on whether characteristics of the research match the case facts. In other words, their decisions would be influenced by the ecological validity of scientific research. For example, jurors may judge a study more favorably when it contains a sample of participants who are similar to members of the population to which the expert wishes to generalize the research findings than when it does not. Jurors using the representativeness heuristic may routinely ignore or undervalue psychological experiments that use college students as participants.

A second heuristic that jurors may use when evaluating an expert's research is that consensus implies correctness. When processing persuasive messages, people rely on others' evaluations of message quality (Axsom et al., 1987), and consensus information can influence people's evaluations of message quality under conditions that produce heuristic processing (Maheswaran & Chaiken, 1991).
In the legal domain, jurors may rely on information about a study's acceptance within the scientific community when evaluating its quality. Jurors may reason that research is methodologically sound if it has been published in a peer-reviewed journal and, therefore, has been evaluated favorably by qualified members of the relevant scientific community. In contrast, jurors may evaluate research negatively if it has not been published or generally accepted. Although this heuristic may lead jurors to make reasonable decisions about psychological science most of the time, jurors may be led astray by evidence of general acceptance in some instances (for an example of how the general acceptance of a phenomenon, the reliability of show-ups, was not supported by research, see Kassin, Ellsworth, & Smith, 1989).

A recent study varied the message-related cues of ecological validity (i.e., representativeness) and general acceptance (i.e., consensus information) to determine their influence on mock jurors' decisions in a simulated hostile work environment case (Kovera, McAuliff, & Hebert, 1999). Jurors in that study relied on the heuristic cues when evaluating the trustworthiness and credibility of certain witnesses. However, their evaluations of evidence quality were unaffected by the message-related cues. The study's intended manipulation of motivation did not work as planned, so it is not clear whether differences in juror motivation can increase sensitivity to methodological flaws in expert evidence.

Overview

We examined three questions related to jurors' evaluations of psychological science presented by an expert at trial:

• Are jurors sensitive to variations in the internal validity of an expert's study?
• Does NC moderate jurors' sensitivity to methodological flaws in an expert's study?
• If jurors are unable to detect invalid research, do they instead rely on heuristic cues when evaluating the research?

To answer these questions, we varied the internal validity of the expert's study and different heuristic cues associated with it (ecological validity, general acceptance). We also assessed jurors' NC to determine whether that variable moderated jurors' sensitivity to the quality of the expert's study. We made several predictions that flow directly from persuasion theory. First, we predicted the following:

Hypothesis 1. Jurors will be sensitive to variations in internal validity, but their sensitivity will vary as a function of NC.

We anticipated that high-NC jurors would be more sensitive to variations in the methodological quality of the study than would low-NC jurors as a result of their increased motivation (enjoying cognitively complex tasks) and ability (increased exposure to and familiarity with cognitively complex tasks). A significant NC × Internal Validity interaction would support this prediction. High-NC jurors should judge the internally valid study more favorably than
the flawed studies. However, low-NC jurors' ratings for the studies should not differ. We hypothesized similar effects across all dependent measures. In addition, we predicted the following:

Hypothesis 2. Jurors will be sensitive to heuristic cues related to the expert evidence, but NC will moderate their sensitivity.

We predicted that low-NC jurors would rely more heavily on heuristic cues associated with the expert's study than would high-NC jurors. A significant NC × Ecological Validity interaction would support this hypothesis such that low-NC jurors should provide more favorable ratings when the expert's study is high versus low in ecological validity. We anticipated a similar NC × General Acceptance interaction. Because we expected high-NC jurors to systematically process the expert evidence, we did not predict that their decisions would vary as a function of the study's ecological validity and general acceptance. We hypothesized similar effects across all dependent measures.

Method

Participants

Participants were 162 U.S. citizens (82 male, 80 female; M age = 42 years) reporting for jury duty in south Florida who volunteered to participate in a study about juror decision making. Most participants were White (69%), married (59%), and reported a gross family income of $60,000 or less (68%). Of the participants, 51% indicated that their highest level of education was a high school diploma or college degree, and 84% of those who attended college had nonscientific majors (e.g., business, English, history, theater). Most of the potential jurors (74%) had not served on a civil or criminal jury before. They participated in exchange for a meal voucher at a restaurant located in the courthouse complex.

Trial Stimulus

Jurors read a 15-page summary of a simulated civil case in which the plaintiff alleged that she was the victim of gender discrimination as a result of a hostile work environment. The fact pattern of the trial simulation was derived from an actual case (Robinson v.
Jacksonville Shipyards Inc., 1991). Certain facts were modified to prevent the possibility that participants might recognize the case from media coverage. The trial simulation consisted of opening statements and closing arguments from both attorneys, direct- and cross-examination testimony from five witnesses, and standard Florida judicial instructions.

The plaintiff was the sole female mechanic who worked with a male maintenance crew in a trucking company service garage. She alleged that sexual materials were displayed throughout the workplace and that she was the target of unwelcome sexual advances. A female coworker corroborated the plaintiff's allegations. Upon cross-examination, the plaintiff admitted that she sometimes used crude language and told sexual jokes to her male coworkers. A social psychologist who testified on the plaintiff's behalf discussed conditions that increase the likelihood of sexual harassment (i.e., rarity of women in the workplace, paucity of information available when evaluating workers for promotion, ambiguity of evaluation criteria, and sexualized working environment).

Two defense witnesses testified. A shift supervisor claimed that the plaintiff never complained about the sexual materials until she was reprimanded for her tardiness and absenteeism from work, and that once the plaintiff complained about the posters, he removed all of them. He added that the plaintiff often used profanity and joked about sexual matters with her male coworkers. A midlevel administrator for the company testified that the plaintiff had shown him examples of materials that she claimed were offensive, but that those materials were similar to many ads appearing on television. The administrator stated that he offered to follow up on the issue, but that the plaintiff did not provide the names of the alleged perpetrators.

Experimental Manipulations

The expert described a study that she had conducted on the effects of sexually suggestive materials on men's behavior toward women.
This study was based on an experiment conducted by Rudman and Borgida (1995), who found that men who viewed sexualized commercials sat closer to a female confederate, evaluated her more negatively, and asked her a greater number of sexually inappropriate questions compared to men who viewed nonsexual commercials. Moreover, the female confederate in that experiment, who was blind to experimental condition, rated the men who had viewed the sexualized commercials to be more sexually motivated than those who had viewed the nonsexual commercials. Within the expert's description of her study, we manipulated its ecological validity, general acceptance, and internal validity.

Ecological validity. In the high ecological validity condition, participants in the expert's study were employees at a trucking company similar to the plaintiff's workplace. In the low ecological validity condition, participants were college undergraduates and, therefore, were less similar to the population to which the expert wished to generalize her findings.

General acceptance. In the generally accepted version, a prestigious peer-reviewed psychology journal had published the study, and other social scientists in the field had cited it favorably. In the not generally accepted version, the expert had just completed her study. Therefore, she had not yet submitted it for publication, and there was no evidence that other researchers had evaluated her research favorably.

Internal validity. The valid version of the study was identical to Rudman and Borgida's (1995) original study and contained no threats to internal validity. Our assessment that this was a valid study is supported by the facts that (a) it was published in a peer-reviewed journal; and (b) it won SPSSI's Gordon Allport prize for the best intergroup relations paper of the year. Unlike participants in the valid version, who viewed either sexualized or nonsexual commercials, participants in the invalid version viewed only the sexualized commercials. Because the invalid version did not include an appropriate control group, any conclusions regarding the effects of viewing sexualized commercials were invalid.

Predictor Variable

Jurors completed the 34-item Need for Cognition Scale (NCS; Cacioppo & Petty, 1982) by indicating their level of agreement with each item using a 9-point, Likert-type scale ranging from -4 (very strong disagreement) to +4 (very strong agreement). Sample items include "Thinking is not my idea of fun" (reverse-scored) and "I prefer complex to simple problems." The NCS is a highly reliable measure, with Cronbach's alphas exceeding .84 across six studies, and split-half and test-retest reliabilities averaging .83 and .87, respectively (Cacioppo et al., 1996). The NCS was introduced to the jurors as a voir dire questionnaire and was completed before they received the trial stimulus and additional dependent measures.

We calculated an NCS score for each participant by reverse-scoring the appropriate items and then summing his or her responses. We combined the NCS scores of all jurors and performed a median split, classifying jurors who fell below the median (Mdn = 52) as low NC and those who fell above the median as high NC. Because five scores fell directly on the median, three were randomly assigned to the low-NC condition and two to the high-NC condition. Table 1 presents the number of low- versus high-NC participants in the ecological validity, general acceptance, and internal validity conditions.
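The scoring and median-split procedure described above can be sketched as follows. This is a minimal illustration, not the authors' code: the reverse-keyed item indices shown are hypothetical, and scores falling exactly on the median are simply flagged here, whereas the authors assigned their five ties to conditions at random.

```python
# Hypothetical indices of reverse-keyed NCS items (for illustration only;
# the actual reverse-scored items are specified by Cacioppo & Petty, 1982).
REVERSE_SCORED = {3, 7, 12}

def score_ncs(responses):
    """Score one juror's 34 NCS responses, each on the -4..+4 scale.

    Reverse-keyed items are flipped (x -> -x on a scale centered at 0),
    then all items are summed into a single NCS score.
    """
    total = 0
    for i, x in enumerate(responses):
        total += -x if i in REVERSE_SCORED else x
    return total

def median_split(scores):
    """Classify scores below the median as low NC and above it as high NC.

    Scores exactly at the median are flagged as ties; the article resolved
    its five ties by random assignment.
    """
    s = sorted(scores)
    n = len(s)
    median = s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
    return ["low" if x < median else "high" if x > median else "tie"
            for x in scores]
```

For example, `median_split([40, 52, 60])` classifies the first score as low NC, the last as high NC, and flags the middle score (the median itself) as a tie.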

Table 1

Observed Cell Sizes for Interactions Between Need for Cognition, Ecological Validity, General Acceptance, and Internal Validity

                             Low NC    High NC
Ecological validity
  Student sample                40        40
  Trucking company sample       38        44
General acceptance
  Unpublished                   43        37
  Published and cited           35        47
Internal validity
  Missing control group         36        46
  Valid                         42        38

Dependent Measures

Jurors decided whether the plaintiff had demonstrated by a preponderance of the evidence that the trucking company constituted a hostile working environment using a dichotomous scale (i.e., Defendant is liable/Defendant is not liable). The jurors evaluated the plaintiff and the expert on eight 7-point bipolar adjective pairs: believable/not believable, certain/uncertain, convincing/not convincing, credible/not credible, intelligent/unintelligent, good/bad, moral/immoral, and respectable/not respectable. Jurors' ratings were averaged across these items to form a single index of credibility for each witness (αs = .88 and .85 for plaintiff and expert, respectively). Jurors also rated the plaintiff and expert on three additional 7-point adjective pairs: honest/dishonest, sincere/insincere, and trustworthy/untrustworthy. These ratings were averaged to form a single index of trustworthiness for each witness (αs = .89 and .91 for plaintiff and expert, respectively). Some of the items were reverse-scored to protect against response bias. Data were recoded so that higher numbers on the final scales represent more positive witness evaluations.

Jurors rated the quality of the expert's study using a series of 7-point scales. Specifically, jurors judged the study's reliability, its validity, the appropriateness of the dependent measures, the appropriateness of the study's procedures, the ability of the findings to determine the effects of sexually explicit materials, and the generalizability of the study's findings. We averaged jurors'
responses to those six items to create a single index of expert evidence quality, with higher numbers representing more positive evaluations (α = .87).

Three additional items served as manipulation checks for the ecological validity, general acceptance, and internal validity variables. Jurors used 7-point scales to indicate how similar the participants in the expert's study were to the workers at the plaintiff's trucking company (i.e., ecological validity) and the degree to which the expert's findings had been accepted by the psychological community (i.e., general acceptance). Higher numbers represent more positive evaluations for both items. With respect to internal validity, jurors responded to a forced-choice question asking what type of television commercials the expert included in her study (i.e., sexualized, nonsexualized, or both).
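The index construction above can be sketched in a few lines. This is an illustrative sketch, not the authors' code: it assumes 7-point items where a reverse-scored response maps x to 8 - x, and it uses the standard Cronbach's alpha formula for the reliability figures reported in parentheses. The example data are invented.

```python
def witness_index(ratings, reverse=frozenset()):
    """Average one juror's 1-7 bipolar adjective ratings into a single index.

    Reverse-keyed items are flipped (x -> 8 - x) so that higher numbers
    always represent a more positive evaluation.
    """
    vals = [(8 - x) if i in reverse else x for i, x in enumerate(ratings)]
    return sum(vals) / len(vals)

def _var(xs):
    """Sample variance (n - 1 denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def cronbach_alpha(items):
    """Cronbach's alpha for a multi-item scale.

    items: one list per item, each holding every respondent's score on
    that item. Alpha = k/(k-1) * (1 - sum(item variances) / var(totals)).
    """
    k = len(items)
    totals = [sum(resp) for resp in zip(*items)]  # per-respondent sum scores
    return k / (k - 1) * (1 - sum(_var(it) for it in items) / _var(totals))
```

Two perfectly correlated items yield an alpha of 1.0; the article's indexes fell in the .85 to .91 range, indicating high internal consistency.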

Procedure

Upon the potential jurors' arrival at the courthouse, the Chief Administrative Judge welcomed them and the jury administration staff provided a brief overview of what jury service would entail. At the conclusion of the orientation, the experimenter made an announcement inviting jurors to participate in a study about juror decision making. Those who wished to participate reported to a lounge adjacent to the jury assembly room.3 After providing their informed consent, jurors received the trial stimulus and dependent measures. The jurors did not deliberate or confer with one another at any point during the study.

Design

Our study used a 2 (Ecological Validity: high vs. low) × 2 (General Acceptance: yes vs. no) × 2 (Internal Validity: yes vs. no) × 2 (Need for Cognition: high vs. low) fully crossed factorial design. We randomly assigned participants to experimental conditions, with two exceptions: We arranged for an equal number of men and women to participate in each condition, and jurors were classified as either high or low NC after the experiment ended.

3 We estimate that approximately 80% of the jurors who heard the announcement and later had the opportunity to participate actually completed our study. These participants were jurors who spent extended time in the juror lounge because (a) they were never seated on venire panels; (b) they were seated on venire panels after lunch recess; or (c) they were seated on venire panels in the morning, but were not selected as jurors for trials and then returned to the juror lounge as members of the general jury pool.

Results

Manipulation Checks

Our experimental manipulations were successful. Jurors4 who read the version of the study that included a sample of trucking employees judged those participants to be more similar to the workers at the plaintiff's workplace than did jurors who read the study that used an undergraduate psychology student sample, F(1, 158) = 7.78, p < .01, partial η² = .05 (Ms = 4.24 and 3.48, respectively). Jurors who read the version of the study that was published in a peer-reviewed journal and cited in psychology textbooks found that study to be more generally accepted within the scientific community than jurors who read the version that was not published and had not been cited by others in the field, F(1, 157) = 22.95, p < .001, partial η² = .13 (Ms = 5.02 and 3.85, respectively). Jurors who read the internally valid version were more likely to report that participants had viewed both sexualized and nonsexual commercials (92%) than that participants had viewed only the sexualized (4%) or only the nonsexual (4%) commercials, χ²(2, N = 79) = 124.05, p < .001, φ = 1.25. Similarly, jurors who read the invalid version were more likely to report that participants had viewed only sexualized commercials (84%) than that participants had viewed only the nonsexual (2%) commercials or both types of commercials (14%), χ²(2, N = 79) = 91.16, p < .001, φ = 1.07.
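The goodness-of-fit statistics above can be recovered approximately from the reported percentages. The counts below are a reconstruction (92%/4%/4% of N = 79 rounds to 73/3/3), not the article's raw data; with equal expected frequencies under the null, a Pearson chi-square on those counts lands on the reported value.

```python
import math

def chi_square_gof(observed, expected):
    """Pearson chi-square goodness-of-fit statistic."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def phi_effect(chi2, n):
    """Phi effect size: sqrt(chi-square / N)."""
    return math.sqrt(chi2 / n)

n = 79
observed = [73, 3, 3]    # both / sexualized only / nonsexual only (reconstructed)
expected = [n / 3] * 3   # equal frequencies under the null hypothesis
chi2 = chi_square_gof(observed, expected)  # close to the reported 124.05
phi = phi_effect(chi2, n)                  # close to the reported 1.25
```

The same arithmetic on reconstructed counts of 66/2/11 for the invalid version reproduces the reported 91.16 to within rounding.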

Data Analysis

We analyzed the data using a series of 2 (Ecological Validity: high vs. low) × 2 (General Acceptance: yes vs. no) × 2 (Internal Validity: yes vs. no) × 2 (Need for Cognition: high vs. low) factorial ANOVAs. Simple main effects tests and simple comparisons were conducted when appropriate. Unless otherwise indicated, the mean differences reported met traditional levels of statistical significance (p ≤ .05). Effect sizes are reported as partial η². We also analyzed jurors' dichotomous verdicts using logistic regression. Finally, we examined the correlations among jurors' verdicts, their witness ratings, and their judgments of study quality.

4 Jurors were the participants in our study. However, they evaluated an expert's study that included trucking company employees or college student participants. Our manipulation checks had to ensure that (a) juror participants in our study rated the trucking company employee participants in the expert's study as being more similar to the plaintiff's coworkers, compared to the college student participants in the expert's study; and (b) juror participants in our study reported that participants in the expert's study viewed either sexualized and nonsexualized materials or only sexualized materials.
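The partial η² values reported with each F test can be recovered from the F statistic and its degrees of freedom. This is the standard conversion (SS_effect / (SS_effect + SS_error) rewritten in terms of F), not a formula given in the article, so treat it as a reader's check rather than the authors' procedure.

```python
def partial_eta_squared(f, df_effect, df_error):
    """Convert an F statistic to partial eta-squared.

    partial eta^2 = SS_effect / (SS_effect + SS_error)
                  = (F * df_effect) / (F * df_effect + df_error)
    """
    return f * df_effect / (f * df_effect + df_error)

# e.g., the ecological validity manipulation check, F(1, 158) = 7.78,
# converts to roughly .05, matching the reported partial eta-squared.
```

The same conversion reproduces the .03 reported for the Internal Validity × NC interaction on verdicts, F(1, 146) = 3.99.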

Figure 1. Verdict as a function of internal validity and need for cognition (NC). [Bar chart of the proportion of liable verdicts: low-NC jurors, .42 (no control) vs. .32 (control); high-NC jurors, .29 (no control) vs. .50 (control).]

Verdict

As predicted, the internal validity of the expert's study and NC interacted to affect jurors' liability judgments, F(1, 146) = 3.99, p < .05, partial η² = .03. High-NC jurors were more likely to find in favor of the plaintiff when they read the valid versus invalid version of the expert's study, F(1, 146) = 3.86, p = .05, partial η² = .03 (50% and 29% for valid and invalid versions, respectively). Low-NC jurors' verdicts did not vary as a function of internal validity, F(1, 146) = 0.80, p = .371, partial η² = .01 (32% and 42% for valid and invalid versions, respectively; see Figure 1). Ecological validity and general acceptance did not interact with NC as hypothesized (see Table 2).

A logistic regression analysis of the verdict data confirmed the ANOVA results. We began by entering each of the main effects (i.e., ecological validity, general acceptance, internal validity, and NC) into the logistic regression model. Next, we used an indicator-variable coding scheme to create a new variable for every possible two-way and higher order interaction between those four variables. We entered all indicator variables into the model and used the backward-stepwise function to eliminate variables from the model. The final model consisted of the four main effect variables and one indicator variable representing the interaction between internal validity and NC. The effect for that indicator variable was statistically significant, χ²(1, N = 162) = 3.97, p < .05, φ = .17. The odds ratio indicates that high-NC jurors who read the valid version were nearly 4 times (3.81) more likely to find in favor of the plaintiff than were jurors who did not satisfy those conditions (i.e., low-NC jurors who read either valid or invalid versions and high-NC jurors who read the invalid version).
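The 3.81 figure is the standard exponentiated-coefficient interpretation of a logistic regression indicator variable. A short sketch (Python, standard library; the coefficient below is back-derived from the reported odds ratio for illustration, not taken from the original model output):

```python
import math

def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)

# A logistic-regression coefficient b on an indicator variable corresponds to
# an odds ratio of exp(b). The article reports only the odds ratio (3.81), so
# the coefficient is reconstructed here: b = ln(3.81) ≈ 1.34.
b = math.log(3.81)
print(round(math.exp(b), 2))  # → 3.81

# For contrast, the raw (unadjusted) odds ratio computed directly from the
# high-NC cell percentages (50% liable with the valid version vs. 29% with
# the invalid version) is smaller, because the fitted model also adjusts
# for the four main effects:
print(round(odds(0.50) / odds(0.29), 2))  # → 2.45
```

The gap between the raw 2.45 and the model-based 3.81 is expected whenever main effects are partialed out of an interaction term.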

Table 2
ANOVA Tables for Nonsignificant Interactions Between Need for Cognition, Ecological Validity, General Acceptance, and Internal Validity for Main Dependent Measures

Dependent measure and source     df     F      p    partial η²
Verdict
  NC × Ecological Validity        1   1.85   .176   .01
  NC × General Acceptance         1   0.00   .983   .00
  Error                         146
Evidence quality
  NC × Ecological Validity        1   0.56   .455   .00
  NC × General Acceptance         1   0.00   .946   .00
  Error                         146
Expert credibility
  NC × Ecological Validity        1   0.16   .690   .00
  NC × General Acceptance         1   1.19   .277   .01
  NC × Internal Validity          1   0.23   .629   .00
  Error                         145
Expert trustworthiness
  NC × Ecological Validity        1   0.96   .329   .01
  NC × General Acceptance         1   2.40   .123   .02
  NC × Internal Validity          1   0.17   .678   .00
  Error                         145
Plaintiff credibility
  NC × Ecological Validity        1   0.02   .893   .00
  NC × General Acceptance         1   0.41   .524   .00
  NC × Internal Validity          1   0.63   .428   .00
  Error                         145
Plaintiff trustworthiness
  NC × Ecological Validity        1   0.24   .623   .00
  NC × General Acceptance         1   0.16   .693   .00
  NC × Internal Validity          1   1.38   .242   .01
  Error                         144

[Figure 2. Ratings of expert evidence quality as a function of internal validity and need for cognition (NC). Mean evidence quality ratings: low NC, 4.26 with a control group and 4.43 without; high NC, 4.35 with a control group and 3.73 without.]

Quality of Expert Evidence

Internal validity and NC interacted to influence jurors' evidence quality ratings, F(1, 146) = 4.09, p < .05, partial η² = .03. High-NC jurors rated the quality of the valid study higher than that of the invalid study, F(1, 146) = 4.42, p < .05, partial η² = .03 (Ms = 4.35 and 3.73 for valid and invalid versions, respectively). The presence or absence of a control group did not affect the evidence quality ratings of low-NC jurors, F(1, 146) = 0.54, ns, partial η² = .00 (control present, M = 4.26; control absent, M = 4.43; see Figure 2). Ecological validity and general acceptance did not interact with NC as predicted (see Table 2).

Credibility and Trustworthiness of Witnesses

The main effects of ecological validity, general acceptance, and internal validity were nonsignificant. None of these variables interacted with NC as predicted (see Table 2).

Correlations Among Dependent Measures

Jurors' ratings of study quality were positively related to their verdicts and to their ratings of plaintiff credibility, plaintiff trustworthiness, and expert credibility (see Table 3). The more favorably jurors viewed the expert's study, the more likely they were to find the defendant liable for a hostile work environment. In addition, the more favorably jurors evaluated the expert's study, the more favorably they evaluated the plaintiff and the expert. Jurors' verdicts were positively related to the plaintiff's credibility, the plaintiff's trustworthiness, and the expert's credibility.
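Because verdict is a dichotomous (0/1) variable, its correlations with the rating scales are point-biserial rs, which are simply Pearson correlations computed with a binary variable. A small sketch (Python, standard library; the data below are made up for illustration and are not the study's):

```python
import math

def pearson_r(x, y):
    """Pearson correlation; with a 0/1 variable this is the point-biserial r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy illustration only: dichotomous verdicts (1 = liable) correlated with
# hypothetical 7-point evidence-quality ratings.
verdicts = [1, 1, 1, 0, 0, 0, 1, 0]
quality  = [6, 5, 7, 3, 4, 2, 5, 3]
print(round(pearson_r(verdicts, quality), 2))  # → 0.87
```

The same computation applied to the study's verdict and quality data would yield the r = .41 reported in Table 3.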

Table 3
Means and Correlations Among Dependent Measures

Measure                         M     SD     1      2      3      4      5
1. Verdict                    0.36   0.48    —
2. Plaintiff credibility      3.91   1.05   .62*    —
3. Plaintiff trustworthiness  3.61   1.31   .53*   .68*    —
4. Expert credibility         5.42   0.98   .18*   .20*   .12     —
5. Expert trustworthiness     5.41   1.27   .05    .06    .05    .68*    —
6. Expert evidence quality    3.98   1.32   .41*   .27*   .35*   .27*   .09

*p ≤ .05.

Discussion

Systematic Processing of Expert Evidence

We tested two hypotheses involving NC and jurors' ability to reason about expert evidence effectively. Hypothesis 1 predicted that jurors would be able to identify flawed expert evidence, but that their ability to do so would be moderated by NC. Our results support this hypothesis: High-NC jurors rated study quality more positively and found in favor of the plaintiff more often when the study was valid than when it was missing a control group. In contrast, low-NC jurors' evaluations were unaffected by variations in the internal validity of the expert evidence.

High-NC jurors may have engaged in more systematic processing of the expert evidence than low-NC jurors for several reasons. First, high-NC jurors are more likely to enjoy and engage in cognitively complex tasks than are low-NC jurors. Therefore, they may be more willing to expend cognitive resources thinking about the scientific aspects of the expert's evidence (Cacioppo et al., 1996). Second, by virtue of being drawn to cognitively complex tasks in the past, high-NC jurors may be more likely than low-NC jurors to possess the reasoning skills necessary to evaluate the quality of research contained in an expert's testimony. That is, because high-NC jurors enjoy critical-thinking and problem-solving tasks, they may have increased knowledge of potential problems involving scientific evidence (e.g., a missing control group) as compared to low-NC jurors. However, because our study did not include separate measures of juror ability (e.g., a reasoning skills test) and motivation (e.g., a self-report motivation rating, length of time spent evaluating the evidence), we cannot conclude that high-NC jurors' increased sensitivity to the study's internal validity was solely a function of their ability or motivation.

Our findings are largely consistent with those of previous basic and applied research on NC. Meta-analytic results of the social psychological literature on persuasion have revealed that variations in argument quality have a greater effect on high- versus low-NC individuals' ratings (Cacioppo et al., 1996). With respect to more applied work, although Bornstein (2004) did not manipulate evidence quality within each type of expert testimony that he presented to mock jurors (anecdotal case history vs. experimental), he did find that low-NC participants were influenced by the more heuristic-type anecdotal evidence whereas high-NC jurors were not.

Our results differ somewhat from those obtained by Leippe et al. (2004), who observed a curvilinear relationship between NC and case strength for verdict, as opposed to the linear relationship between NC and evidence quality that emerged in the present study. Those researchers predicted that high-NC jurors' increased elaboration and systematic processing of a criminal case would provide more of a basis for reasonable doubt, compared to moderate-NC jurors, when the prosecution's case was strong. Thus, high-NC jurors would be more advantageous to the side with the weaker case (i.e., the defense) than moderate- or low-NC jurors. Indeed, Leippe et al.'s data supported their hypothesis. Compared to moderate-NC jurors, high-NC jurors had lower perceptions of guilt when the case was strong and higher perceptions of guilt when the case was weak. Why, in comparison to jurors with lower NC, were high-NC jurors in Leippe et al.'s (2004) study less likely to convict when the case was strong, whereas high-NC jurors in the present study were more likely to find the defendant liable when the expert evidence was valid?
We believe these contradictory findings are a result, in large part, of the different evidentiary standards used for defendant guilt in Leippe et al.’s criminal trial (i.e., beyond a reasonable doubt) and defendant liability in our civil trial (i.e., preponderance of the evidence). Weak or invalid evidence is much more damaging when jurors must decide whether the case was proven beyond a reasonable doubt than by a preponderance of the evidence. Recognizing the strengths of the other side’s case or identifying flaws in the prosecution’s expert evidence via increased elaboration may be enough to shift the balance in favor of acquittal under the criminal standard, but it is much more difficult to do so under the civil standard. Additional research examining the effects of NC and case strength/evidence quality on verdict within the context of a criminal case (e.g., murder) and its civil counterpart (e.g., wrongful death) is necessary to evaluate the validity of this post hoc explanation.

Heuristic Processing of Expert Evidence

Hypothesis 2 predicted that low-NC jurors would rely on the message-related cues of ecological validity and general acceptance when evaluating the evidence because they lacked the ability and motivation to systematically scrutinize the expert's study. Our results do not support this hypothesis. One potential explanation regarding the lack of effects for low-NC jurors involves the sufficiency principle of the HSM (Chaiken, 1980; Chaiken et al., 1989). According to the HSM, message perceivers have competing interests in that they want to be accurate, yet must be economy-minded consumers of information because of the limited availability of cognitive resources. Perceivers must strike a balance between those two competing interests, and that compromise is reflected in the sufficiency principle. The sufficiency principle maintains that perceivers will exert as much cognitive effort as is required (and possible) to be sufficiently confident that they are accurate. In our study, low-NC jurors' sufficiency threshold may have been low enough that a decision based on any expert testimony, irrespective of its quality, was sufficient to satisfy their accuracy goals. This would reflect a "Good enough for the expert, good enough for me" type of reasoning.

Even if jurors were motivated to process the expert evidence, their ability to do so may have been constrained by the complexity of the reasoning task and the message-related cues included in both studies. The message-related heuristic cues, as operationalized in the present research, may have been too methodologically oriented for low-NC jurors to comprehend. Consider how the heuristic cue manipulations we used differ from those that are typically described in the social cognitive literature (e.g., "Experts can be trusted," "Length implies strength"). Those heuristics do not involve any type of methodological judgment whatsoever.
Arguments and other relevant information are evaluated at face value based on whether they fit the relevant heuristic. The heuristic cues included in the present research also involve face-value judgments based on certain characteristics (e.g., consensus, representativeness), but those judgments are more methodologically oriented: Are the study participants similar to the population to which the expert wishes to generalize? Was the study peer-reviewed? Traditional heuristics may be less obvious in the context of psychological science and may require at minimum a superficial knowledge of research methodology.

Several characteristics of the present study increase our confidence that the null effects associated with the ecological-validity and general-acceptance manipulations were not statistical artifacts. First, jurors' responses to the manipulation checks indicate that they attended to the experimentally manipulated variables when reading the expert testimony. Second, post hoc power analyses confirm that our study had sufficient power to detect the hypothesized differences resulting from the NC × Ecological Validity and NC × General Acceptance interactions. Power was equal to .89 to detect a medium-sized effect, given the number of participants in our sample (α = .05). Third, as further evidence supporting the statistical power of the tests, differences relatively small in size (partial η² = .03; Cohen, 1988) reached traditional levels of statistical significance in several of the ANOVAs used to examine our data. Finally, there was no evidence that the null effects were the result of a restricted response range. Jurors' responses varied greatly within and among the various dependent measures, and there was no evidence of a floor or ceiling effect, as most means fell more toward the scale midpoint than toward either extreme. For these reasons, we are confident that the null effects associated with the ecological-validity and general-acceptance manipulations reflect a true lack of differences in jurors' evaluations of expert evidence rather than a statistical artifact.
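The "medium-sized effect" benchmark used in the power analysis can be related to the η² values reported elsewhere via the standard conversion f² = η² / (1 − η²) (Cohen, 1988). A quick sketch (Python, standard library; the conversion formulas are the conventional ones, and the input values are the article's reported η² and Cohen's benchmark):

```python
import math

def eta2_from_cohens_f(f):
    """(Partial) eta-squared from Cohen's f: η² = f² / (1 + f²)."""
    return f**2 / (1.0 + f**2)

def cohens_f_from_eta2(eta2):
    """Cohen's f from (partial) eta-squared: f = sqrt(η² / (1 − η²))."""
    return math.sqrt(eta2 / (1.0 - eta2))

# Cohen's benchmark for a "medium" effect, f = .25, corresponds to η² ≈ .06,
# somewhat larger than the η² = .03 interaction effects that did reach
# significance in this study.
print(round(eta2_from_cohens_f(0.25), 3))  # → 0.059
print(round(cohens_f_from_eta2(0.03), 2))  # → 0.18
```

This makes the power claim concrete: the design had .89 power for effects of roughly η² ≈ .06, while the detected effects sat near η² = .03 (f ≈ .18).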

Correlations Among Dependent Measures

Overall, there was a positive relation between jurors' evaluations of expert evidence quality and their verdicts. On the one hand, those results are encouraging when we consider that high-NC jurors were able to identify flawed research. Based on high-NC jurors' sensitivity to the internal validity of the expert evidence, we can expect those jurors to make better decisions when relying on their evaluations of the study's quality. On the other hand, the fact that jurors' evaluations of expert evidence were positively related to their verdicts is discouraging because low-NC jurors failed to differentiate between the internally valid and invalid studies. That is, those jurors were unable to recognize the absence of an appropriate control group, yet still relied on their evaluations of the study when rendering their verdicts.

Limitations and Research Implications

Certain limitations govern our ability to generalize our findings to situations in which jurors are asked to reason about scientific evidence. First, many of the NC differences observed (although statistically significant) were small in size, with partial η² values less than .05, thus explaining less than 5% of the variance in participants' responses. It is clear that other dispositional or situational variables affect jurors' ability to identify flawed scientific evidence. Second, our written trial summary, although detailed and realistic, constitutes a relatively impoverished stimulus compared to the courtroom experience of jurors in a real case. It lacks certain source-related (i.e., expert credentials) and audience-related (i.e., reactions of other jurors) cues that might interact with internal-validity manipulations to affect jurors' decisions. Also, jurors rendered liability verdicts independent of one another. Previous research has suggested, however, that jurors' pre- and post-deliberation verdicts often do not differ (Hastie, Penrod, & Pennington, 1983), that jurors rarely discuss the expert or the expert's testimony during deliberations (Kovera, Gresham, Borgida, Gray, & Regan, 1997), and that high-NC jurors do not make substantively better arguments during group deliberations than do low-NC jurors, despite being more talkative and active and appearing to be more persuasive (Shestowsky & Horowitz, 2004).

Despite its limitations, the present research has theoretical implications for future research examining people's ability to reason about scientific evidence. Previous research on scientific reasoning has relatively neglected the role of individual differences in the ability to evaluate research methods. Our research suggests that NC is one individual difference that moderates people's ability to identify basic flaws (e.g., a missing control group) in psychological science. Future research could identify additional individual differences that moderate people's reasoning abilities, as well as examine the independent and interactive roles that ability and motivation play in high-NC jurors' ability to reason about psychological science. Are high-NC jurors able to identify flawed expert evidence because of increased motivation alone, or do they also possess increased knowledge about methodological and statistical issues?

The present research has several practical implications for trials containing psychological science. First, NC appears to play an important role in jurors' evaluations of scientific evidence.
Perhaps attorneys and judges can increase the likelihood of seating jurors who are able to critically evaluate scientific evidence by using the revised 18-item Need for Cognition Scale (Cacioppo, Petty, & Kao, 1984) during the jury-selection process. This outcome should lead to more effective juror decision making and increase the likelihood that justice is served in cases containing psychological science. Second, because trial strategy may lead some attorneys to seek jurors who will find it difficult to evaluate scientific methodology, jurors who lack the ability to evaluate psychological science without additional assistance may continue to decide cases. Therefore, research is needed to examine the effectiveness of legal safeguards, such as cross-examination and judicial instructions, against the influence of unreliable scientific evidence on juror decisions. Such research may help courts better accommodate jurors' reasoning skills in future civil and criminal trials containing psychological science.

References

Axsom, D., Yates, S., & Chaiken, S. (1987). Audience response as a heuristic cue in persuasion. Journal of Personality and Social Psychology, 53, 30–40.
Bar-Hillel, M. (1980). The base-rate fallacy in probability judgments. Acta Psychologica, 44, 211–233.
Bornstein, B. H. (2004). The impact of different types of expert scientific testimony on mock jurors' liability verdicts. Psychology, Crime, and Law, 10, 429–446.
Cacioppo, J. T., & Petty, R. E. (1982). The need for cognition. Journal of Personality and Social Psychology, 42, 116–131.
Cacioppo, J. T., Petty, R. E., Feinstein, J. A., & Jarvis, W. B. G. (1996). Dispositional differences in cognitive motivation: The life and times of individuals varying in the need for cognition. Psychological Bulletin, 119, 197–253.
Cacioppo, J. T., Petty, R. E., & Kao, C. F. (1984). The efficient assessment of need for cognition. Journal of Personality Assessment, 48, 306–307.
Cacioppo, J. T., Petty, R. E., Kao, C. F., & Rodriguez, R. (1986). Central and peripheral routes to persuasion: An individual difference perspective. Journal of Personality and Social Psychology, 51, 1032–1043.
Chaiken, S. (1980). Heuristic versus systematic information processing and the use of source versus message cues in persuasion. Journal of Personality and Social Psychology, 39, 752–766.
Chaiken, S., Liberman, A., & Eagly, A. (1989). Heuristic and systematic information processing within and beyond the persuasion context. In J. S. Uleman & J. A. Bargh (Eds.), Unintended thought (pp. 212–251). New York: Guilford.
Chaiken, S., & Maheswaran, D. (1994). Heuristic processing can bias systematic processing: Effects of source credibility, argument ambiguity, and task importance on attitude judgment. Journal of Personality and Social Psychology, 66, 460–473.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Cooper, J., Bennett, E. A., & Sukel, H. L. (1996). Complex scientific testimony: How do jurors make decisions? Law and Human Behavior, 20, 379–394.
Cooper, J., & Neuhaus, I. M. (2000). The "hired gun" effect: Assessing the effect of pay, frequency of testifying, and credentials on the perception of expert testimony. Law and Human Behavior, 24, 149–171.
Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993).
Faigman, D. L., & Baglioni, A. J. (1988). Bayes' theorem in the trial process: Instructing jurors on the value of statistical evidence. Law and Human Behavior, 12, 1–17.
Fong, G. T., Krantz, D. H., & Nisbett, R. E. (1986). The effects of statistical training on thinking about everyday problems. Cognitive Psychology, 18, 253–292.
Frye v. United States, 54 App. D.C. 46, 293 F. 1013 (1923).
Gatowski, S. I., Dobbin, S. A., Richardson, J. T., Ginsburg, G. P., Merlino, M. L., & Dahir, V. (2001). Asking the gatekeepers: A national survey of judges on judging expert evidence in a post-Daubert world. Law and Human Behavior, 25, 433–458.
General Electric Co. et al. v. Joiner et ux., 522 U.S. 136 (1997).
Gray, T., & Mill, D. (1990). Critical abilities, graduate education (biology versus English), and belief in unsubstantiated phenomena. Canadian Journal of Behavioural Science, 22, 162–172.
Groscup, J. L., Penrod, S. D., Studebaker, C. A., Huss, M. T., & O'Neil, K. M. (2002). The effects of Daubert on the admissibility of expert testimony in state and federal criminal cases. Psychology, Public Policy, and Law, 8, 339–372.
Hamill, R., Wilson, T. D., & Nisbett, R. E. (1980). Insensitivity to sample bias: Generalizing from atypical cases. Journal of Personality and Social Psychology, 39, 578–589.
Hastie, R., Penrod, S. D., & Pennington, N. (1983). Inside the jury. Cambridge, MA: Harvard University Press.
Jepson, C., Krantz, D. H., & Nisbett, R. E. (1983). Inductive reasoning: Competence or skill? Behavioral and Brain Sciences, 6, 494–501.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3, 430–454.
Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 237–251.
Kassin, S. M., Ellsworth, P. C., & Smith, V. L. (1989). The "general acceptance" of psychological research on eyewitness testimony: A survey of the experts. American Psychologist, 44, 1089–1098.
Kaye, D. H., & Koehler, J. J. (1991). Can jurors understand probabilistic evidence? Journal of the Royal Statistical Society: Series A, 154(Part 1), 75–81.
Kovera, M. B., Gresham, A. W., Borgida, E., Gray, E., & Regan, P. C. (1997). Does expert testimony inform or influence juror decision making? A social cognitive analysis. Journal of Applied Psychology, 82, 178–191.
Kovera, M. B., & McAuliff, B. D. (2000). The effects of peer review and evidence quality on judge evaluations of psychological science: Are judges effective gatekeepers? Journal of Applied Psychology, 85, 574–586.
Kovera, M. B., McAuliff, B. D., & Hebert, K. S. (1999). Reasoning about scientific evidence: Effects of juror gender and evidence quality on juror decisions in a hostile work environment case. Journal of Applied Psychology, 84, 362–375.
Kumho Tire Co., Ltd., et al. v. Carmichael et al., 526 U.S. 137 (1999).
Leippe, M. R., Eisenstadt, D., Rauch, S. M., & Seib, H. M. (2004). Timing of eyewitness expert testimony, jurors' need for cognition, and case strength as determinants of trial verdicts. Journal of Applied Psychology, 89, 524–541.
Maheswaran, D., & Chaiken, S. (1991). Promoting systematic processing in low-motivation settings: Effect of incongruent information on processing and judgment. Journal of Personality and Social Psychology, 61, 13–25.
Mill, D., Gray, T., & Mandel, D. R. (1994). Influence of research methods and statistics courses on everyday reasoning, critical abilities, and belief in unsubstantiated phenomena. Canadian Journal of Behavioural Science, 26, 246–258.
Niedermeier, K. E., Kerr, N. L., & Messé, L. A. (1999). Jurors' use of naked statistical evidence: Exploring bases and implications of the Wells effect. Journal of Personality and Social Psychology, 76, 533–542.
Petty, R. E., & Cacioppo, J. T. (1984). The effects of involvement on responses to argument quantity and quality: Central and peripheral routes to persuasion. Journal of Personality and Social Psychology, 46, 69–81.
Petty, R. E., & Cacioppo, J. T. (1986). The elaboration likelihood model of persuasion. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 19, pp. 123–203). New York: Academic Press.
Petty, R. E., Cacioppo, J. T., & Goldman, R. (1981). Personal involvement as a determinant of argument-based persuasion. Journal of Personality and Social Psychology, 41, 847–855.
Robinson v. Jacksonville Shipyards, Inc., 760 F. Supp. 1486 (M.D. Fla. 1991).
Rudman, L. A., & Borgida, E. (1995). The afterglow of construct accessibility: The behavioral consequences of priming men to view women as sexual objects. Journal of Experimental Social Psychology, 31, 493–517.
Schklar, J., & Diamond, S. S. (1999). Juror reactions to DNA evidence: Errors and expectancies. Law and Human Behavior, 23, 159–184.
Shestowsky, D., & Horowitz, L. H. (2004). How the Need for Cognition scale predicts behavior in mock jury deliberations. Law and Human Behavior, 28, 305–337.
Shuman, D. W., Whitaker, E., & Champagne, A. (1994). An empirical examination of the use of expert witnesses in the courts: Part II. A three-city study. Jurimetrics, 34, 193–208.
Simonson, I., & Nye, P. (1992). The effect of accountability on susceptibility to decision errors. Organizational Behavior and Human Decision Processes, 51, 416–446.
Stanovich, K. E., & West, R. F. (1998). Who uses base rates and P(D/~H)? An analysis of individual differences. Memory and Cognition, 26, 161–179.
Thompson, W. C., & Schumann, E. L. (1987). Interpretation of statistical evidence in criminal trials: The prosecutor's fallacy and the defense attorney's fallacy. Law and Human Behavior, 11, 167–187.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124–1131.
Wells, G. L. (1992). Naked statistical evidence of liability: Is subjective probability enough? Journal of Personality and Social Psychology, 62, 739–752.