Bayesian Evaluation of Informative Hypotheses

Herbert Hoijtink and Rens van de Schoot
Department of Methods and Statistics, Utrecht University
[email protected] [email protected]

Are you happier if a p-value is .049 rather than .051?

1. Yes!!
2. H0: m1 = m2 = m3 versus Ha: not H0. But is .049 more support for Ha than .051?
3. R.A. Fisher wrote in a margin: “for example .05”
4. A p-value of .05 roughly implies that PMP(H0) = .25 and PMP(Ha) = .75
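As a hedged illustration of point 4: one well-known calibration (Sellke, Bayarri & Berger, 2001) bounds the Bayes factor for H0 from below by -e·p·ln(p). The sketch below applies it, assuming equal prior probabilities for H0 and Ha; the exact PMPs depend on the prior, so ".25 versus .75" is only a rule of thumb.

```python
# Minimal sketch: the -e * p * ln(p) lower bound on the Bayes factor
# for H0 (Sellke, Bayarri & Berger, 2001), converted to a posterior
# model probability assuming PMP(H0) = PMP(Ha) = .5 a priori.
import math

def pmp_h0(p, prior_h0=0.5):
    bf_h0 = -math.e * p * math.log(p)          # lower bound, valid for p < 1/e
    odds = bf_h0 * prior_h0 / (1 - prior_h0)   # posterior odds of H0
    return odds / (1 + odds)

for p in (0.049, 0.051):
    print(p, round(pmp_h0(p), 2))              # both come out near .29
```

Note that .049 and .051 translate into virtually identical posterior probabilities, which is exactly the point of the question above.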


Did you ever have trouble finding a meaningful interpretation upon finding one or more significant test results?

Test results (p-values) for a three-factor design:

Main A: .01   Main B: .34   Main C: .06
AxB: .03   AxC: .56   BxC: .20   AxBxC: .02

[Figure: cell means plotted for the levels of A (A=1, A=2), B (B=1, B=2, B=3), and C (C=1, C=2).]

The Bayesian solution is not “to let the data speak” but “to specify one or more sets of relations between the means before the data are observed, and to let the data decide if and which of these sets of relations are supported”, e.g. Hi1: m1 > m2 > m3 or Hi2: m1 < m2 = m3, and to subsequently compute PMPs, e.g. .92 for Hi1 and .08 for Hi2.


Did you ever worry about the interpretation of p-values when testing more than one hypothesis?

With one test the probability of a wrong decision if H0 is true is .05. With six tests the probability of one or more wrong decisions is about .26; this renders too many false positives. Corrections like Bonferroni, i.e. .05/6? These render too many false negatives, that is, a severe loss of power. Do you ever test more than one hypothesis? Yes! Do you correct or not? Do you report all the hypotheses tested in your paper? The Bayesian solution is to specify a few competing hypotheses and to determine the support in the data for each hypothesis.
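The .26 above is simple arithmetic for six independent tests at α = .05; a quick check:

```python
# Familywise error rate for k independent tests at level alpha,
# plus the Bonferroni-corrected per-test level mentioned above.
alpha, k = 0.05, 6
fwer = 1 - (1 - alpha) ** k
print(round(fwer, 2))       # 0.26: chance of at least one false positive
print(round(alpha / k, 4))  # 0.0083: Bonferroni threshold per test
```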

Do you like large sample sizes because more tests will be significant? Yes! The larger the sample, the better “the view” of the population. But for large sample sizes all tests are significant. Consequently, all one can do is determine what is going on via the evaluation of effect sizes. This, however, is rather ad hoc and subjective, and appropriate effect sizes are not always available. It is more or less the same interpretation problem discussed earlier. Again, with a Bayesian approach you simply determine the support in the data for each hypothesis.
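A minimal simulation of this point, under assumed numbers (a tiny true difference of 0.1 standard deviations between two groups): as n grows, the p-value shrinks toward zero even though the effect stays negligible.

```python
# Sketch: a negligible true effect (d = 0.1) eventually becomes
# "significant" once the sample is large enough. Uses numpy and scipy.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
for n in (50, 500, 5000, 50000):
    x = rng.normal(0.0, 1.0, n)          # group 1: true mean 0
    y = rng.normal(0.1, 1.0, n)          # group 2: true mean 0.1
    print(n, stats.ttest_ind(x, y).pvalue)
```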


Did you ever quit a research project because none of the tests were significant?

Compare

H0: m1 = m2 = m3 versus Ha: not H0

with

Hi1: m1 = m2 = m3 versus Hi2: m1 > m2 > m3

In which situation do you have more power? (See the sketch below.)
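One way to see the answer in Bayesian terms: the ordering constraint of Hi2 covers only a small share of the unconstrained parameter space, so the same fit buys a larger Bayes factor. A minimal Monte Carlo sketch of that prior share (the complexity 1/c introduced later), assuming exchangeable prior draws for the three means:

```python
# Complexity of Hi2: m1 > m2 > m3 is the proportion of unconstrained,
# exchangeable prior draws that happen to satisfy the ordering.
import numpy as np

rng = np.random.default_rng(0)
draws = rng.uniform(0, 1, size=(1_000_000, 3))
in_order = (draws[:, 0] > draws[:, 1]) & (draws[:, 1] > draws[:, 2])
print(in_order.mean())   # ~ 0.167 = 1/6: one of the 3! possible orderings
```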

Examples of Informative Hypotheses

Note in the sequel the absence of the traditional null hypothesis (“nothing is going on”) and the presence of multiple alternative hypotheses (“something is going on and I have a fair idea what is going on”), which is different from the traditional alternative hypothesis (“something is going on but I don’t know what”).


Example 1: Lucas (2003). Status processes and the institutionalization of women as leaders. American Sociological Review, 68, 468-480.

Lucas (2003) describes an experiment that measures how competent the participants consider their leader to be. There are five conditions: 1) a man is appointed leader at random; 2) a woman is appointed leader at random; 3) a man is appointed leader on account of skills proven to the group; 4) a woman is appointed leader on account of skills proven to the group; 5) a group where female leadership is institutionalized as normal and legitimate, and in which subsequently a woman is appointed leader.

Descriptive statistics for the dependent variable y:

Group                                Mean   Std. Deviation   N
1  Random Male Leader                2.33   1.86             30
2  Random Female Leader              1.33   1.15             30
3  Skilled Male Leader               3.20   1.79             30
4  Skilled Female Leader             2.23   1.45             30
5  Institutionalized Female Leader   3.23   1.50             30


Lucas (2003) wants to know whether the mean competence scores have the following structure:

H1: μ2 < μ1 < μ4 < μ3 = μ5, or rather

H2: μ1, μ3 > μ2, μ4, μ5, or

H3: μ1, μ2, μ3, μ4, μ5 (no constraints)

A Bayesian analysis could render PMPs of .75, .05, and .20 for H1, H2, and H3, respectively.

Briefly repeat the advantages and disadvantages of PMPs and p-values.

Example 2: Hasel, L.E., & Kassin, S.M. (2009). On the presumption of evidentiary independence: Can confessions corrupt eyewitness identifications? Psychological Science, 20, 122-126.

Participants in the experiment witnessed a theft. Each participant had to identify the thief from a six-person target-absent photographic lineup. The participants who made an identification gave a confidence rating (Phase a) for their identification. Two days later these participants were randomly assigned to four conditions: in Condition 1 they were told that the person they identified confessed, in Condition 2 that all suspects denied, in Condition 3 that the person they identified denied, and in Condition 4 that another person confessed. Thereafter these participants gave another confidence rating (Phase b) for their identification.


Hasel and Kassin tested the provocative hypothesis that “a confession will lead eyewitnesses to change ... their confidence in those [identification] decisions”. The hypothesis implied in their paper is:

Hi: μb1 > μa1, μb2 < μa2, μb3 < μa3, μb4 < μa4
    μb1 > μb2 > μb3 > μb4

that is, if the person identified confesses the confidence increases; in the other three conditions the confidence decreases; and the confidence rating in Phase b decreases from Condition 1 to Condition 4. The latter is logical because the amount of secondary evidence provided for the identification decreases from Condition 1 to Condition 4.


But … what is the hypothesis to which Hi should be compared? Not the traditional null hypothesis: Hasel and Kassin do not at all believe that all means could be equal. Since they have one clear expectation, the choice should be Hc, that is, “not Hi”. The resulting PMPs are .98 and .02 for Hi and Hc, respectively.
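A hedged sketch of how such a comparison against the complement works: if 1/c and 1/d are the prior and posterior proportions in agreement with Hi, the complement Hc gets the remaining proportions. The fit and complexity values below are hypothetical (the slides report only the resulting PMPs), chosen so that the output lands near .98/.02.

```python
# Comparing Hi against its complement Hc ("not Hi") via the prior and
# posterior proportions in agreement with Hi. Values are hypothetical.
def pmps_vs_complement(fit_i, comp_i):
    bf_i = fit_i / comp_i                 # Hi versus the unconstrained model
    bf_c = (1 - fit_i) / (1 - comp_i)     # Hc versus the unconstrained model
    total = bf_i + bf_c
    return bf_i / total, bf_c / total

print(pmps_vs_complement(fit_i=0.30, comp_i=0.01))  # ~ (0.98, 0.02)
```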

The Bayesian Alternative for Hypothesis Testing

Consider the following three hypotheses:

H1: μ1 ≈ μ2
H2: μ1 > μ2
H3: μ1, μ2 (no constraints)

The information in these hypotheses can be formalized in so-called “prior distributions”, one for each hypothesis. These prior distributions represent the information with respect to the means implied by each hypothesis.


[Figure: the prior distributions on the (μ1, μ2) plane implied by each hypothesis: for H1: μ1 ≈ μ2 a narrow band around the line μ1 = μ2, for H2: μ1 > μ2 the region where μ1 exceeds μ2, and for H3: μ1, μ2 the unconstrained plane.]

Data: perceived academic competence scores for 43 boys.

Mother’s work gives opportunity to solve problems (n = 26):
36.9 34.6 26.4 33.3 35.4 34.8 32.3 34.5 36.0 24.5 31.6 36.1 36.8
27.9 34.4 33.8 36.9 34.4 31.7 29.4 34.1 18.2 34.5 35.3 35.5 33.4

Mother’s work does not give opportunity to solve problems (n = 17):
23.5 22.5 36.4 40.0 30.6 30.5 34.5 31.1 19.4 29.6 24.8 25.0 28.8
32.5 33.3 29.6 24.5

Moorehouse and Sanders (1992). Children’s feelings of … Social Development, 1, 185-200.


The distribution of the data can be used to provide bounds for the prior of the μ’s. Here, for each μ, a lower bound of 26 and an upper bound of 36 will be used.

[Figure: the unconstrained prior for H3: μ1, μ2, uniform on the square with lower bound 26 and upper bound 36 for both means.]
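The slides do not state the exact rule behind 26 and 36; one plausible, purely illustrative reading is "pooled sample mean plus or minus 5", sketched below on the data above.

```python
# Hypothetical illustration: deriving prior bounds from the data.
import numpy as np

group1 = [36.9, 34.6, 26.4, 33.3, 35.4, 34.8, 32.3, 34.5, 36.0, 24.5,
          31.6, 36.1, 36.8, 27.9, 34.4, 33.8, 36.9, 34.4, 31.7, 29.4,
          34.1, 18.2, 34.5, 35.3, 35.5, 33.4]
group2 = [23.5, 22.5, 36.4, 40.0, 30.6, 30.5, 34.5, 31.1, 19.4, 29.6,
          24.8, 25.0, 28.8, 32.5, 33.3, 29.6, 24.5]
scores = np.array(group1 + group2)

center = scores.mean()                       # pooled mean, about 31.4
print(round(center - 5), round(center + 5))  # 26 and 36, the bounds above
```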


Support in the Data for the Hypothesis at Hand

[Figure: for each of H1: μ1 ≈ μ2, H2: μ1 > μ2, and H3: μ1, μ2, the constrained region on the (μ1, μ2) plane (axes around 31) together with the isodensity curves of the likelihood/posterior.]

The ellipses are the isodensity curves of the likelihood/posterior of the μ’s given the data. Discuss the support of the likelihood for each of the three hypotheses under consideration in terms of fit and model size.


The Bayes Factor and Posterior Probabilities

Let 1/cM denote the proportion of the unconstrained prior in agreement with model M, and 1/dM the proportion of the unconstrained posterior in agreement with model M. Then 1/cM can be interpreted as model complexity and 1/dM as model fit, and

BFM3 = (1/dM) / (1/cM),

that is, the relative support in the data for the constrained model with respect to the unconstrained model.
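A minimal sketch of how 1/cM and 1/dM can be estimated by sampling, for H2: μ1 > μ2 in the example above. The uniform [26, 36] prior follows the bounds chosen earlier; the normal posterior with known σ, the rounded sample statistics (means 32.8 and 29.2, standard deviations 4.4 and 5.4, computed from the data above), and the neglect of truncation at the bounds are all simplifying assumptions, not the authors' exact procedure.

```python
# Estimate complexity (1/c) and fit (1/d) for H2: mu1 > mu2 by counting
# the proportion of prior and posterior draws satisfying the constraint.
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Unconstrained prior: mu1 and mu2 independent, uniform on [26, 36].
prior = rng.uniform(26, 36, size=(n, 2))

# Rough unconstrained posterior: independent normals at the sample
# means with standard errors from the data (a simplification).
post = np.column_stack([
    rng.normal(32.8, 4.4 / np.sqrt(26), n),   # group 1, n = 26
    rng.normal(29.2, 5.4 / np.sqrt(17), n),   # group 2, n = 17
])

h2 = lambda m: m[:, 0] > m[:, 1]              # constraint of H2
print("1/c =", round(h2(prior).mean(), 2))    # ~ .50, by symmetry
print("1/d =", round(h2(post).mean(), 2))     # ~ .99
print("BF23 =", round(h2(post).mean() / h2(prior).mean(), 1))  # ~ 2.0
```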

[Figure: the three hypotheses on the (μ1, μ2) plane, each with the proportion of the unconstrained prior (1/c) and posterior (1/d) in agreement with it:]

H1: μ1 ≈ μ2    1/c = .022   1/d = .003
H2: μ1 > μ2    1/c = .50    1/d = .99
H3: μ1, μ2     1/c = 1.0    1/d = 1.0

BF13 = .003 / .022 = .137

BF23 = .99 / .50 = 2.0

PMP1 = .137 / (.137 + 2.0 + 1) = .04

PMP2 = 2.0 / (.137 + 2.0 + 1) = .64

PMP3 = 1 / (.137 + 2.0 + 1) = .32
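These numbers can be reproduced in a few lines, assuming equal prior model probabilities:

```python
# Bayes factors of each model against the unconstrained H3, then PMPs.
c_inv = {"H1": 0.022, "H2": 0.50, "H3": 1.0}   # complexity: 1/cM
d_inv = {"H1": 0.003, "H2": 0.99, "H3": 1.0}   # fit: 1/dM

bf = {h: d_inv[h] / c_inv[h] for h in c_inv}   # BF13 = .136, BF23 = 1.98
total = sum(bf.values())
pmp = {h: round(b / total, 2) for h, b in bf.items()}
print(pmp)                                     # {'H1': 0.04, 'H2': 0.64, 'H3': 0.32}
```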


Interpretation of Posterior Model Probabilities

Conditional error probabilities: what is the probability of making an error after observing the data? This is not the same as an error probability of .05 if H0 is true. BUT: posterior probabilities are not necessarily frequentist probabilities.

Degree of support: by weighting fit and model size as illustrated earlier.

